Preface

A solid-state drive relies on the FTL (Flash Translation Layer) to translate between logical addresses and physical addresses. If power is cut abnormally while the SSD is performing normal work such as reads, writes, or erases, the mapping table may be lost because there was no time to update it, and the system may then fail to recognize the SSD at all.
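To make the role of the mapping table concrete, here is a minimal, purely illustrative FTL sketch in Python (the class and its fields are invented for illustration, not any vendor's implementation): the host addresses logical pages, the FTL redirects them to physical NAND pages, and losing the table makes otherwise-intact data unreachable.

```python
# Minimal illustrative sketch of an FTL mapping table (not a real controller).
class SimpleFTL:
    def __init__(self, num_physical_pages):
        self.mapping = {}                      # logical page -> physical page
        self.free_pages = list(range(num_physical_pages))
        self.nand = {}                         # physical page -> stored data

    def write(self, logical_page, data):
        # NAND cannot overwrite in place: each write goes to a fresh page,
        # and the mapping table is updated to point at the new location.
        phys = self.free_pages.pop(0)
        self.nand[phys] = data
        self.mapping[logical_page] = phys

    def read(self, logical_page):
        phys = self.mapping.get(logical_page)  # no entry -> data unreachable
        return None if phys is None else self.nand[phys]

ftl = SimpleFTL(num_physical_pages=8)
ftl.write(0, b"hello")
assert ftl.read(0) == b"hello"

# If power is lost before the table reaches NAND, the data may still sit in
# flash, but without the mapping there is no way to find it:
ftl.mapping.clear()                            # simulate a lost mapping table
assert ftl.read(0) is None
```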

At the same time, to improve read and write performance, SSDs generally use SDRAM as a cache. If power is cut abnormally during reads or writes, data held in SDRAM may be lost before it can be written to NAND flash, and a pending mapping-table update may likewise fail to reach NAND, so the mapping table goes missing.

Failure phenomena caused by abnormal power loss

After an abnormal power loss, an SSD generally shows one of three failure phenomena:

1. The SSD is no longer recognized by the system; the mapping table must be rebuilt, or the drive restored by some crude recovery method, before it can be used again;

2. After many power losses, the SSD shows many “new bad blocks”;

The mechanism behind these new bad blocks is that when the SSD fails to read, write, or erase a block, that block is marked as bad. These blocks are not actually defective; they are misjudged because the operation was interrupted by the abnormal power loss.

3. Data held in SDRAM is lost.

Common power-loss protection mechanisms

Vendors understand power-loss protection differently, so the mechanisms they implement for users differ as well. In general there are two approaches:

1. Save all data held in SDRAM

On abnormal power-off, all data in SDRAM must be written to NAND flash completely. SDRAM capacity is generally set to about one-thousandth of the SSD's capacity, so for a small-capacity SSD the amount of data to flush is relatively small, and a supercapacitor or tantalum capacitors can keep the drive alive long enough to finish writing. But if the SSD's capacity is large, say 8TB, the amount of SDRAM data to flush to NAND becomes very large, and if you still rely on a supercapacitor or tantalum capacitors for hold-up power, you inevitably face three tricky problems:

a. More tantalum capacitor parts are needed for protection. In actual engineering practice this is a severe test: engineers run up against thickness and form-factor limits, and there is simply not enough PCB area;

b. Even with enough capacitance for protection, the SSD will not start properly when a “reset” is issued: it must first stay powered off for a while before restarting, because the SSD can only be recognized again after all the tantalum capacitors have discharged;

c. After a few years of use, the tantalum or supercapacitors age, and their hold-up energy no longer reaches the original design target. The user is then again exposed to data loss after a power failure, or to the risk that the SSD cannot be recognized. If the initial design compensates with redundant capacitors, it falls back into the vicious cycle of problem “b”.

The good news is that problems b and c do have sound solutions; resolving these thorny issues only takes the engineers' ingenuity and experience.
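To see why hold-up capacitance scales badly with drive capacity, a back-of-the-envelope sizing sketch follows; every figure in it (cache size, write bandwidth, power draw, voltages) is an assumed illustrative value, not a number from the text.

```python
# Back-of-the-envelope hold-up sizing sketch (all figures are illustrative
# assumptions): how much capacitance is needed to flush the SDRAM cache to
# NAND after power is lost.

dram_bytes     = 8e9        # 8 GB cache, e.g. for an 8 TB SSD (~1/1000 ratio)
nand_write_bps = 2e9        # assumed 2 GB/s sustained NAND program bandwidth
power_w        = 10.0       # assumed drive power draw while flushing
v_start, v_min = 5.0, 3.0   # capacitor voltage at power loss / minimum usable

flush_time_s = dram_bytes / nand_write_bps           # 4.0 s to drain the cache
energy_j     = power_w * flush_time_s                # 40 J of hold-up energy

# Usable energy in a capacitor discharging from v_start down to v_min:
# E = 1/2 * C * (v_start^2 - v_min^2)  ->  C = 2E / (v_start^2 - v_min^2)
cap_farads = 2 * energy_j / (v_start**2 - v_min**2)  # 5.0 F

print(f"flush time: {flush_time_s:.1f} s, energy: {energy_j:.0f} J, "
      f"capacitance: {cap_farads:.1f} F")
```

Five farads is supercapacitor territory, far beyond what tantalum parts on a cramped PCB can provide, which is exactly the engineering squeeze described in problem "a" above.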

2. Save only the user data in SDRAM, not the mapping table

This reduces both the amount of SDRAM used and the number of tantalum capacitors. “Not saving the mapping table” does not mean the mapping table is lost; it only means the mapping update for the last data written is not saved. At the next power-on, the SSD finds the last saved mapping table and replays the newly written data to rebuild it. The drawback is that if this mechanism is not set up well, rebuilding the mapping table takes longer, and the SSD needs some time before it can be accessed normally.
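The rebuild idea can be sketched as follows, under the common assumption (details vary by controller) that each NAND page stores its logical address and a write sequence number in its out-of-band metadata, so the FTL can replay pages written after the last checkpoint:

```python
# Sketch of mapping-table rebuild at power-on (assumed mechanism, details
# vary by controller): load the last table saved to NAND, then replay any
# pages whose sequence number is newer than that checkpoint.

def rebuild_mapping(saved_table, checkpoint_seq, nand_pages):
    """saved_table: logical -> physical, as last flushed to NAND.
    nand_pages: iterable of (physical, logical, seq) page metadata."""
    table = dict(saved_table)
    # Replay in write order so later writes to the same logical page win.
    for phys, logical, seq in sorted(nand_pages, key=lambda p: p[2]):
        if seq > checkpoint_seq:
            table[logical] = phys
    return table

saved = {0: 10, 1: 11}                # table saved at checkpoint seq 100
pages = [(10, 0, 90), (11, 1, 95),    # already reflected in saved table
         (12, 0, 101), (13, 2, 102)]  # written after the checkpoint
rebuilt = rebuild_mapping(saved, 100, pages)
assert rebuilt == {0: 12, 1: 11, 2: 13}
```

The more pages written since the last checkpoint, the more metadata must be scanned and replayed, which is why a poorly tuned checkpoint interval makes the drive slow to become accessible after power-on.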

For controllers designed without SDRAM, all data is written directly to NAND flash, and completion is reported to the host only after the data has actually reached NAND, so there is no cached data to lose. For applications with high reliability requirements and no need to cache extra data, the DRAM-less design is king; its representative is a German industrial-grade controller brand. Its only drawback is that performance is not as good, but in fact many applications do not need the highest performance, only “sufficient” performance.

Test methods and principles

In concrete testing, the SSD must be tested both as the system (boot) disk and as a secondary (data) disk. The only difference between the two methods is that as the boot disk the whole computer must be shut down, while as a data disk only the SSD itself needs to be powered off.

a. Starting from a blank SSD, fill it with data to 25%, 50%, 85%, and 100% of capacity in turn; at each fill level, power off abnormally while data is being written, 3000 times, with a 3-second interval between power-off and power-on;

The reason for writing different amounts of data to the disk is this: once the SSD holds a certain amount of data, garbage collection starts in the background. Garbage collection means data relocation, and data relocation means the mapping table is updated; an abnormal power loss at exactly that moment is when problems usually appear.
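A minimal sketch of why garbage collection multiplies mapping-table updates (illustrative only, not a real controller's algorithm): GC copies the still-valid pages out of a victim block and rewrites one mapping entry per relocated page, and each of those updates is a window in which an abnormal power loss can strike.

```python
# Illustrative garbage-collection step: copy valid pages out of a mostly-
# stale victim block, updating the mapping table for every page moved.

def garbage_collect(block_pages, mapping, free_pages):
    """block_pages: list of (physical, logical) pairs in the victim block.
    Returns how many mapping entries were rewritten."""
    moves = 0
    for phys, logical in block_pages:
        if mapping.get(logical) == phys:       # page still valid?
            new_phys = free_pages.pop(0)       # copy data to a fresh page...
            mapping[logical] = new_phys        # ...and update the mapping
            moves += 1
    return moves

mapping = {0: 100, 1: 101, 2: 205}             # logical 2 already rewritten
victim = [(100, 0), (101, 1), (102, 2)]        # phys 102 is stale
free = [300, 301, 302]
assert garbage_collect(victim, mapping, free) == 2
assert mapping == {0: 300, 1: 301, 2: 205}
```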

b. Power off the SSD abnormally while data is being written normally;

c. Power off abnormally while data is being deleted;

In Windows, deleting data also performs write actions, just as creating a file does, so the mapping table must be updated as well.

d. Power off abnormally while the SSD is reading files; test 3000 times, with a 3-second interval between power-off and power-on;

e. Power off abnormally during a normal shutdown; test 3000 times;

f. Power off abnormally while the operating system is booting normally; test 3000 times.
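The per-case cycling above can be sketched as a test harness; the workload, power-switch, and health-check hooks below are hypothetical placeholders for whatever drives the actual rig (typically a programmable power supply or relay board, with drive health checked through the OS).

```python
# Sketch of the power-cycle test loop for cases d-f (3000 cycles, 3 s
# between power-off and power-on). All hook functions are hypothetical
# placeholders supplied by the test rig.
import time

CYCLES = 3000
OFF_ON_INTERVAL_S = 3

def run_power_cycle_test(start_workload, cut_power, restore_power,
                         drive_is_healthy,
                         cycles=CYCLES, interval_s=OFF_ON_INTERVAL_S):
    for cycle in range(1, cycles + 1):
        start_workload()                 # e.g. begin reading files (case d)
        cut_power()                      # abnormal power-off mid-operation
        time.sleep(interval_s)           # required off/on interval
        restore_power()
        if not drive_is_healthy():       # drive must enumerate and read back
            return cycle                 # report the failing cycle
    return None                          # survived all cycles
```

Running the same loop with a different `start_workload` covers writing, deleting, reading, shutting down, and booting, so one harness serves all the cases listed above.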