Specman Verification

The Scoreboard (or Data Checker)

Introduction

Automatic checkers are usually divided into two categories:

1. Temporal checkers - These checkers constantly monitor the DUT's external and internal interfaces, and make sure that they do not deviate from the specified protocol rules. For example, they should check that data is never sent over an interface when ready is not asserted, or that a request is always granted after a finite number of clock cycles. Temporal checkers are concerned with form rather than content. They check how data and controls are passed from one block to another, but they do not check the actual content. The data of a packet might be all wrong, but as long as it is sent according to the interface protocol rules, the temporal checker is satisfied.

2. Data checkers - These checkers verify that the actual data that is sent by the DUT is correct, and complies with the specification. For example, they should check that the data that is read from a specified register/counter is as expected, or that the fields of a packet (destination address, source address, headers, crc) have all been assigned the right values.

Data checkers operate on a higher level than temporal checkers. The data they work on is logical data (the actual logical fields of a command - opcode, data) rather than physical data (lists of bits and bytes). The data checking is done some time after a logical unit of data (packet, command, register, memory segment) has been collected from the bus, and not after each physical unit of data (bit, byte, word) is collected. Unlike temporal checkers, they do their checking in zero time - one logical structure is compared with another, and if a difference is found an error is issued. In fact, they couldn't care less about time... while for temporal checkers the relative timing of events is almost everything.

In non-OO languages such as VHDL or Verilog, where only basic data types exist (bit vectors of variable size), data checking could only be done on the raw physical data that is taken off the bus. You can of course store the data somewhere (in an array or a large bit vector), but even if you manage to do that, you will not be able to check the data as logical data, but only as physical data. That is, the actual checking would always consist of a comparison of two bit vectors, and not of a comparison of two data items. When an error is detected, the typical error message would be "bits 58 to 68 are different than expected" and not "the destination address is different than expected", which is far more comprehensible. Since logical data structures, and even dynamically sized arrays or lists, do not exist in non-OO languages, storing data until a full logical unit of data arrives on the bus has no significant advantages.

That's why you would often find code that does both the data checking and the temporal checking. Each byte is checked at the moment when it appears on the bus. Such a piece of code is referred to as a "cycle-accurate model" of the DUT's behavior, because it reflects the exact behavior of the DUT on a cycle by cycle basis. It is often an exact copy of the DUT implementation. Apart from being annoying and boring to write, it also requires constant maintenance, since the checker code is tightly coupled with the DUT code, and every small change in the DUT requires a corresponding change in the checker (even such insignificant changes as increasing an output delay by one clock cycle).

HVLs such as Specman-e, SystemVerilog or Vera, on the other hand, are OO languages, and therefore enable the programmer to create complex data types and logical representations of data. In such languages data checking can be done on a logical level (a packet against a packet) rather than on the physical level (bit vector against bit vector). These languages enable us to completely separate the checkers for the physical level - the temporal checkers - from those of the logical level - the data checkers. Such a separation makes sense because the interface protocol rules have nothing to do with the data that is transferred over the interface. It also facilitates reuse, because each level can vary independently of the other. You might replace an IDE bus with a PCI or a SCSI bus and leave the overlaying logical data intact, or you might use the same IDE connection for two completely different types of data.

With Specman-e, data checkers and temporal checkers each have their own dedicated language constructs, and are usually placed in separate logical units and separate files. Temporal checkers use temporal expressions (Specman-e's built-in assertion language) to monitor interfaces. Data checkers use lists, queries on lists, and some extremely handy methods like "deep_compare()" and "deep_compare_physical()" to store data, search for data and compare data items to each other on a logical level. Temporal checkers are usually placed in "drivers", "collectors", "BFM"s (Bus Functional Models - non-synthesizable models that imitate the behavior of a bus) or "agents" - in short, in all those elements that inject data, collect data and monitor physical interfaces. Data checkers usually have their own dedicated files.
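To make the split a bit more tangible, here is a minimal sketch of one checker of each kind. Everything in it is an assumption made for illustration only: the unit names, the BusItem struct, the HDL paths and the 10 cycle limit are all invented and are not taken from any real environment.

<'
// All names and HDL paths below are hypothetical.
struct BusItem {
   %addr : uint(bits:16);
   %data : uint(bits:32);
};

// Temporal checker: lives next to the physical interface
// and only cares about the protocol.
unit BusProtocolChecker {
   event clk is rise('~/top/clk') @sim;
   event req is true('~/top/req' == 1) @clk;
   event gnt is true('~/top/gnt' == 1) @clk;

   // every request must be granted within 10 clock cycles
   expect reqIsGranted is @req => {[..9]; @gnt} @clk
      else dut_error("req was not granted within 10 cycles");
};

// Data checker: compares whole logical items in zero time
// and knows nothing about clocks or signals.
unit BusDataChecker {
   compareItems(received : BusItem, expected : BusItem) is {
      var diffs : list of string =
         deep_compare_physical(received, expected, 10);
      check that diffs.is_empty()
         else dut_error("Item mismatch:\n", str_join(diffs, "\n"));
   };
};
'>

Note that the two units share nothing, so either one could be replaced without touching the other.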

The diagram below shows a typical scoreboard. The scoreboard gets the DUT logical input data (i.e. packets, controller commands) either directly from a data generator (A), or from the physical interface (B) in which case it has to be reconstructed into logical data (C). Once it knows the logical data that went into the DUT it calculates the expected output logical data. In many cases, for example, when the DUT's response depends not only on the applied stimulus but also on internal state, predicting the exact output data is not a trivial task. The actual output data is fed into the scoreboard from the physical interface on the DUT output side (D). This data also has to be reconstructed into its logical form (E).
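The flow described above can be sketched roughly as follows. This is only a skeleton under some loud assumptions: the SbItem struct and all method names are hypothetical, the DUT is assumed to deliver items in order, and the "DUT model" in predict() simply passes its input through unchanged.

<'
// Hypothetical logical data item
struct SbItem {
   %addr : uint(bits:8);
   %data : uint(bits:8);
};

unit SimpleSb {
   !expectedItems : list of SbItem;

   // DUT model (assumption: the DUT leaves items untouched)
   predict(inItem : SbItem) : SbItem is {
      result = deep_copy(inItem);
   };

   // called with the logical input data, whether it came
   // straight from the generator (A) or was reconstructed
   // from the input interface (B, C)
   addInput(inItem : SbItem) is {
      expectedItems.add(predict(inItem));
   };

   // called with the reconstructed output data (D, E)
   addOutput(actual : SbItem) is {
      if expectedItems.is_empty() {
         dut_error("Unexpected item at DUT output");
      } else {
         var expItem : SbItem = expectedItems.pop0();
         var diffs : list of string =
            deep_compare_physical(actual, expItem, 10);
         check that diffs.is_empty()
            else dut_error("Item mismatch:\n", str_join(diffs, "\n"));
      };
   };
};
'>

In a real scoreboard predict() is usually where most of the work goes, since it has to follow the DUT's internal state as well.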

 

Where should the DUT input data be taken from?

The first thing to consider when writing a scoreboard is where to take the input data from. This is one of those eternal questions that almost always pop up when you’re designing a testbench. The input data can usually be taken either directly from the generator or from a monitor located inside your agent or BFM.

 

Choosing between these two options depends on your answers to the two following questions:

A. Is it guaranteed that your scoreboard will always be able to take data from the generator? If you are writing a block level/module level environment, there is a good chance that you will want to reuse your scoreboard at the top level as well. That is, you will want your scoreboard to remain active during tests that are run at top level. However, if your block becomes an internal one at top level, your stimulus won't come from the generator but from other DUT blocks. If you take data from the generator, your top level reusability will be limited, and it's highly probable that you will have to rewrite large parts of your code. The following diagrams demonstrate the problem:

B. How complex are the data structures that your scoreboard has to check? Newbies might find this question a bit vague. However, as a rule of thumb it might be said that the more logical fields, and the more "when" subtypes that are determined by those logical fields you have, the more complex your data is. If your data is relatively simple then it means that you might pretty easily reconstruct it from the list of bits/bytes that you normally get from your monitor. If, for example, your data is made out only of physical fields, then reconstructing it will be automatically taken care of by unpack(), and will not necessitate any extra code on your part. On the other hand, if the exact structure of data is determined according to the combination of values of several fields, some of which only conditionally exist, you will soon find yourself extending your do_unpack(), with large amounts of code in order to decipher the bits and bytes you have at hand. The following example will give you a little taste of that:

<'
// To distinguish type A from type B 
// we have to check if there's a 
// termination byte equal to 0xff. 
// In an A packet all bytes after the
// da/sa fields will be smaller than 0xff.
// In a B packet at least one of the 
// bytes after the da/sa fields will be
// equal to 0xff

type PacketKind : [A,B];

struct Packet {
   kind : PacketKind;

   %da : uint(bits:48);
   %sa : uint(bits:48);
}; // struct Packet

struct BHeader {
   dataSize : uint(bits:3);
   keep dataSize < 5;
   %data[dataSize] : list of byte;
   %termination : byte;
   keep termination == 0xff;
};

extend B Packet {
   %bHeader : BHeader;
};

extend Packet {
   %payload : list of byte;
   keep for each in payload {
      it != 0xff;
   };
};

'>

if we take the data from the monitor
we will have to reconstruct it through
a custom extension of do_unpack()

<'

extend Packet {
   do_unpack(options:pack_options, l: list of bit, begin: int):int is first{
      var bHeaderBytes : list of byte;
      // peek at the 5 bytes (bits 96..135) that follow da/sa
      unpack(options, l[96..96+8*5-1], bHeaderBytes);
      
      kind = bHeaderBytes.has(it == 0xff) ? B : A;
      // dataSize must be limited or we'll get
      // a run time error during unpacking
      // because Specman won't know the size
      // of the data list.
      // since we're using a type qualifier
      // to access a conditional field, we 
      // must put "me" at the beginning or
      // we'll get a compile error.
      if kind == B {
         // dataSize is a field of BHeader, so the header
         // is allocated here first (assumption: unpack()
         // fills an already allocated sub-struct)
         me.B'bHeader = new;
         me.B'bHeader.dataSize = bHeaderBytes.first_index(it == 0xff);
      };
   };
};
'>

Now if this additional piece of code looks harmless enough, that's only because I took pity on you, and on myself, and my example doesn't mirror the horrible stuff you'll see in real life. But take my word that when you deal with data made out of multiple layers, each of which has to be parsed (like the layer added by the B packets in the example above), things can soon become really ugly. Also, note that in order to write your custom unpack you sometimes have to specify exact bit locations. That kind of code brings you just one step too close to the HDL you're checking, i.e. it creates a very tight coupling between testbench and DUT code. If a bit moves one place, you might have to change your testbench to match.

If you are just saying to yourself: "Well, what do I need to unpack the data for? I can just have my scoreboard compare bits and bytes!", then get up and slap yourself twice. An error of the type "bit 66 is different than expected" is just impossible to understand and debug quickly. You must provide the designers with more or less meaningful errors. An example of a meaningful error would be "sa is different than expected" or "payload byte 0 is different than expected". Once your high level data is reconstructed from the bits and bytes, you can use the built-in e method "deep_compare_physical()", which will automatically give you nice errors of this type.

To sum it all up, if you're doing top level simulation and your stimulus data is complex, consider taking data from the generator. If your data is simple to reconstruct or your block is about to become an internal one in a bigger design, it would probably be better to take it from the monitor.

Handling DUT output data

Unlike the case of DUT input data, where data can be sometimes taken from the generator, in the case of DUT output data, you always have to collect physical data from the bus and then to reconstruct it into its logical form. Now, some of you are probably thinking that if you end up having to reconstruct data anyway, why not do it at the input as well? In other words, what's the point of taking data from the generator to save reconstruction at the input, if you have to do it for the output anyhow? In fact, that's a very reasonable question, which could be given two answers.

A bullshit answer would be that input data and output data could have entirely different formats, and therefore code that you use to decipher and reconstruct the one will not do for the other. While this sentence is entirely true, it is still a bullshit answer to the question, because 90% of the scoreboards are either written for data chips, and then you have two more or less symmetric transmit and receive directions, or they are written for cases in which input and output data have the same format.

A better answer would be that when you are reconstructing the DUT output data, you can sometimes use the expected data as a hint in order to make reconstruction easier. For example, you can assume that the data you have just received has the same structure (same fields, same list sizes etc.) as the expected data, and then open it up according to that structure. Of course this doesn't amount to more than a guess, but at least it's an intelligent one, because once you are out of the initial testing phase, you are much more likely to encounter small errors (a field is wrong) than major ones (the whole structure is wrong). Of course, your scoreboard will still detect major errors as well, it's just that the error messages it gives would be somewhat difficult to understand, and would probably require some intervention on your part to help the designer understand what went wrong.

The following example shows a scoreboard that uses the expected result as a hint. The DUT is supposed to change the packet da field, based on its type, and to leave the packet otherwise untouched.



simple example of a scoreboard that uses
expected data to open up output data.

packet conversion function (DUT model)

<'

extend Packet {
   convert() : Packet is {
      // start from a copy, so that everything except da
      // is left untouched
      result = deep_copy(me);
   };
};

extend A Packet {
   convert() : Packet is also {
      result.da += 1;
   };
};

extend B Packet {
   convert() : Packet is also {
      result.da += 2;
   };
};

'>

at scoreboard

<'

unit PacketSb {
   //...
   
   !expectedPackets : list of Packet;

   //...
   removeFromSb(physicalData : list of bit) is {
      var receivedPacket : Packet = new;
      
      // assume that the DUT is a fifo...
      // use expected parameters as a hint to
      // reconstruct physical data
      
      receivedPacket.kind = expectedPackets[0].kind;
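      if expectedPackets[0] is a B Packet (expectedB) {
         // dataSize is a field of the B header, so the
         // header has to be allocated first
         receivedPacket.B'bHeader = new;
         receivedPacket.B'bHeader.dataSize = expectedB.bHeader.dataSize;
      };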
      
      unpack(packing.low, physicalData, receivedPacket);
      
      check that 
        deep_compare_physical(receivedPacket, expectedPackets[0], 1) is empty 
        else dut_error(
         "Received packet : ", receivedPacket, "\n",
         "is different than expected : ", expectedPackets[0], "\n",
         deep_compare_physical(receivedPacket, expectedPackets[0], 10)
        ); 
   };

};

'>

Note that if the structure of the received packet is different than you expected (for example A instead of B), your error message will be somewhat enigmatic: it will show a list of differing fields, but since the received packet was opened up according to the wrong structure, the fields listed for it will not be the right ones. Depending on your specification, it might be possible to detect cases where an item was reconstructed according to a wrong structure, and print a more comprehensible error message in this case. For example, this is how it might be done in our case:



to detect a case when the received data
was reconstructed according to the wrong
structure, we can perform a sanity check
on the received data.

<'

extend Packet {
   // return TRUE if sane
   sanityCheck() : bool is empty;
}; // extend Packet

extend A Packet {
   sanityCheck() : bool is also {
      // if payload has 0xff it's a B packet
      return (not payload.has(it == 0xff));
   };
};

extend B Packet {
   sanityCheck() : bool is also {
      // a B header must end with the 0xff termination byte
      // (termination is a field of BHeader)
      return (bHeader.termination == 0xff);
   };
};

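extend PacketSb {
   //..
   // checkReceived() is a hypothetical helper: call it
   // from removeFromSb(), right after unpack()
   checkReceived(receivedPacket : Packet) is {
      check that receivedPacket.sanityCheck()
      else dut_error(
       "Expected packet structure : ", 
       expectedPackets[0].kind, "\n",
       "But received a different one"
      );

      check that 
        deep_compare_physical(receivedPacket, expectedPackets[0], 1) is empty 
        else dut_error("..");
   };
};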

'>

And this is where I miss a nice SystemVerilog feature that allows you to check if a certain data item complies with all of the generation constraints...

Work continues on this section. The sub-chapters below will hopefully be added soon.

1. Making the scoreboard error proof, and capable of continuing execution after an error is detected

2. End of test routines.

Simulation End Control

When you start verifying a project, runs with a "happy ending" are non-existent - all the simulations die young because of various DUT errors. As the verification process advances, the average life expectancy of the simulations gradually rises, and you start getting very long runs. In the vast majority of cases, these long runs are useless, since the bugs they find will cost too much time to fix. Of course, almost any DUT has a limited number of states that will occur "naturally" only after several hours of simulation - an error counter overflow is a typical example. In these cases it would be better to try and recreate these situations "artificially", for example by setting HDL signals from a specific test file to values that are different from those they get upon reset. Long runs should be used only as a last resort, and only at the most advanced stages of the verification process. The days when I used to write HDL are long gone, but I can still tell that a bug that is found after more than 10 minutes requires iron nerves, especially when you don't get the fix right the first time. A test that doesn't fail after 10 minutes should be elegantly terminated, since it will not be of use anyway.

Having said that, the steps we should take seem simple enough: start a TCM that calls stop_run() (a global method used to end a test), after the approximate equivalent of 10 minutes real world time, in simulation time. The problem with this brutal method is that it stops everything on the spot, so the scoreboards will not be empty, and if you have some pending temporal checks, they will not have time to end. So, if you have injected a packet and it didn't come out, you wouldn't be able to tell whether this is due to some DUT bug, or due to the fact that the test has abruptly reached its end. The same goes for temporal checks - a write might not be acknowledged because the test had ended a cycle earlier than expected.
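For reference, the brutal method might look more or less like this. It is only a sketch: the clock path and the maxCycles field are hypothetical, and the number of cycles that corresponds to 10 minutes of real world time is something you have to estimate for your own simulator and design.

<'
extend sys {
   // hypothetical clock path
   event clk is rise('~/top/clk') @sim;

   // rough equivalent of 10 minutes real world time
   // (assumed value, tune it per project)
   maxCycles : uint;
   keep soft maxCycles == 100000;

   watchdog() @clk is {
      wait [maxCycles] * cycle;
      // stops everything on the spot, whatever the
      // scoreboards and temporal checks are doing
      stop_run();
   };

   run() is also {
      start watchdog();
   };
};
'>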

The ideal solution to this problem (from my point of view) would be one that gets the maximal simulation time (or better, the maximal real world time), and stops the stimuli on each interface accordingly, so that each injected data item has sufficient time to come out. Then we would know that when the test ends, all our scoreboards should be empty (that is, each injected data item has received some response), and if this is not the case, then this must be due to some unexpected behavior of the DUT. This ideal solution is not simple to implement, but god willing I will write it some day in a very elegant and generic way, and publish it on this site, for the benefit of all.
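The "scoreboards should be empty" part, at least, is easy to sketch. The extension below assumes the PacketSb unit and its expectedPackets list from the scoreboard example above, and relies on the predefined check() method that Specman calls once the run has stopped. It is only meaningful if the stimulus was stopped early enough for the last items to drain.

<'
extend PacketSb {
   check() is also {
      check that expectedPackets.is_empty()
         else dut_error(
          expectedPackets.size(),
          " expected packet(s) never came out of the DUT"
         );
   };
};
'>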

In the meantime, e has a built-in solution, which is not quite ideal, but is still quite useful. Given a list of tasks that should be executed during a simulation run, e provides a way to stop any activity on the DUT interfaces after these tasks were executed, wait for a user specified time, and then end the simulation by calling stop_run(). You can, for example, require that every test will transmit at least 100 packets on each interface, and perform at least 1000 CPU read/write cycles before it ends. After both of these tasks are done, the interfaces will go to idle state, and Specman will wait for a time that you determine before it actually ends the test. This solution, called "the objection mechanism", is described in detail in the eRM document (see more below), so I will not go into it here. Its main disadvantage, in my opinion, is that it is not directly related to the simulation time, and that you have to calculate by yourself how long you think each task will take. Still, while you wait for my ideal solution, you can probably do quite well with that.
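Just to give a feeling of what this looks like in code, here is a rough sketch of a single task that uses the TEST_DONE objection. The unit, its fields, the clock path and the drain time of 50 cycles are all hypothetical. The eRM remains the place to look for the full story.

<'
unit PacketInjector {
   // hypothetical clock path
   event clk is rise('~/top/clk') @sim;

   numPackets : uint;
   keep soft numPackets == 100;

   inject() @clk is {
      // the test may not end while this objection is raised
      raise_objection(TEST_DONE);
      for i from 1 to numPackets {
         // ... drive one packet here ...
         wait [1] * cycle;
      };
      // user specified drain time, so that the last
      // packets have time to come out of the DUT
      wait [50] * cycle;
      drop_objection(TEST_DONE);
   };

   run() is also {
      start inject();
   };
};
'>

Once every unit that raised TEST_DONE has dropped it, the run is ended for you, so no explicit call to stop_run() is needed here.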

Coverage

The last block in our testbench is responsible for Coverage, more exactly, for functional coverage. As the name implies, coverage is a way to measure the part of a design that has been "covered" by tests, that is, the part of the design that the tests had actually checked. Functional coverage checks the DUT's functionality, i.e. that it actually behaves according to the specification in all situations. There are other types of coverage, the most common of which is code coverage, which checks how many times, if at all, the simulator had executed each line in the DUT's code. Unlike functional coverage, code coverage is "stupid" in that it is fully automatic and doesn't require any understanding of what the device is actually supposed to do. Functional coverage, on the other hand, requires a very good understanding of the DUT's expected behavior.

There are two main types of functional coverage. Input coverage should check if the generator is in fact generating all the types of data that one expects it to generate. For example, it should check whether an e model of an 8086 program RAM is generating all the available assembler commands. Input coverage should be taken seriously because of a curious behavior of Specman: when its generator fails to generate a certain combination, due to a true or imagined contradiction in your constraints, it will sometimes just generate an "easier" type of data without bothering to share this crucial information with you (see more in the generation tip below). Output coverage is there in order to tell you if you have in fact tested all the parts that you wanted to test in your design. In other words, when the output coverage is 100% you can send your design to the ASIC vendor and start packing for a vacation in Hawaii.
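As an illustration, input coverage for the Packet struct used in the scoreboard examples might be sketched like this. The event name, the payloadLen item and the range boundaries are assumptions made up for the example, so pick buckets that mean something in your own specification.

<'
extend Packet {
   // hypothetical event, emitted once per generated packet
   event generated;

   post_generate() is also {
      emit generated;
   };

   cover generated is {
      // did we generate both A and B packets?
      item kind;
      // ...with payloads of all interesting lengths?
      item payloadLen : uint = payload.size() using ranges = {
         range([0..0],     "empty");
         range([1..63],    "short");
         range([64..1500], "long");
      };
      // ...and every length for every kind?
      cross kind, payloadLen;
   };
};
'>

A hole in the cross is exactly the kind of silent generation failure that you would otherwise never hear about.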

The output coverage should be based on a detailed document, written after the specification by the same writer, then passed over to the design engineer, who should look as best as he can for the weakest spots in his design and suggest how they are to be tested. There might be situations that seem to need testing, and yet are very difficult to detect from outside the block or the chip. A good example of this is a state machine: you should check the behavior of your code in all states, but these states are hardly visible from the outside. Therefore, output coverage is usually collected both from the outside, using the scoreboard in order to identify specific interesting situations, and from the inside, either by spying on internal signals, such as state machine registers, or, in an end to end simulation, by relying on internal scoreboards.

Newbies to random testbenches often find it difficult to grasp the importance of functional coverage. I frequently get emails that ask me to explain the difference between coverage and placing breakpoints on some of the places you would like to check in the HDL code, which is more or less equivalent to using code coverage (and close to assertion-based coverage). I must admit it took me some time to find a simple and satisfying answer to this seemingly obvious question. It is true that you can sometimes check if a random testbench generates a specific problematic/interesting test scenario (in other words, that it is doing its job properly) in either of these two ways. The major difference, however, is that placing a breakpoint in the HDL code obviously requires a very good knowledge of the HDL code, while only minimal, if any, such knowledge is necessary to write functional coverage code. If code coverage is used, then the design engineer himself will probably be needed to analyze the results and determine why a certain line has or has not been activated. So, functional coverage moves you one step ahead towards the ideal goal of having a design team and a verification team whose one and only meeting point is the design specification.

A Few Words on Methodology

As with any programming language, writing e code is a lot easier than writing e code properly. In this context the word "properly" means code that is efficient, readable, extendable, and reusable in parts or as a whole. Many techniques that are commonly used to write quality code in other languages, especially object oriented ones, are also applicable to e (for example those that are known by software architects as "design patterns"). In a way, writing good e code is both easier and harder than writing good code in other languages. It is a lot easier because an e programmer faces quite a small variety of "real world" problems in comparison with, say, a Windows programmer. As shown just above, most testbenches share some basic, yet important, similarities. The fact that an e programmer knows in advance what the basic parts of the testbench will be already makes things a lot less complicated. In part, this advantage is offset by e's flexibility and its aspect oriented nature. Once you know what you are doing, flexibility is great, but beginners might find the wealth of options confusing, and this confusion will soon find its way into the code they write.

When I first published this tutorial, more than a year ago, this part contained a critique of Cadence's insufficient attention to pressing methodological issues. Since the publication of the eRM (e Reuse Methodology guide), one has to admit that this is no longer the case. The eRM contains helpful guidelines for coding some of the basic parts of a testbench (for example drivers and collectors), and recommended solutions to some of the most common problems that are encountered by testbench writers (for example: how to end a test? how to synchronize various testbench parts?). I still think that Cadence has not pursued the eRM project far enough, and that maybe they should have created a collection of base classes for the most common testbench parts (similar in a way to the Microsoft Foundation Classes, which contain numerous classes that no Windows programmer can do without), but this should not stop you from giving the eRM a thorough reading. Since the eRM refers to many problems that are encountered in almost every testbench, it is highly recommended reading even for people who write testbenches in other languages. Unfortunately, this document is not published on the web, and is only available to Specman users. License-less readers can either (a) beg Cadence support to send them a copy (they might do so) or (b) ask one of their licensed friends to send them one. Another recommended option is to check out the gradually growing part of this website that is dedicated to verification methodology. In this part I combine some of Cadence's recommendations with the many lessons that I have learned from my own lengthy experience, to bring you my own code reuse guide. The verification methodology part will probably also be helpful if you have tried to read the eRM and have found it hard to understand, since I have this peculiarity of trying to explain everything in as simple words as I can. If you have read this far, this should come as no surprise...


