Enterprise drive with Industrial SSD reliability

Why Do Enterprise-Class SSDs Need Industrial-Grade Resilience?

SSDs | 2024-10-07

Designing solid-state drives (SSDs) into enterprise infrastructures requires consideration of several factors at the product level, including data integrity, power loss protection, low latency, and advanced error correction capabilities. However, what truly sets enterprise-class SSDs apart is the extensive testing, validation, and resource investment in their design process, as well as the evidence demonstrating their suitability for integration into enterprise infrastructures in a range of challenging environments.

 

The trend of Information Technology (IT) and Operational Technology (OT) convergence in uncontrolled environments has further intensified the need for robust, industrial-grade storage solutions in enterprise infrastructures.

 

                                                                                                              

                                                                                        

Enterprise Class Meets Industrial Grade

 

The enterprise is undergoing constant transformation. Data and compute processes are being pushed to the edge, beyond the walls of the traditional data center, where SSDs may be installed in rough and remote environments and exposed to temperature fluctuations, humidity, and even shock and vibration.

This reflects the need to offer storage products that are not only enterprise-ready but also have the resilience and durability of industrial-grade SSD solutions.

To bridge the two worlds, ATP is proud and excited to deliver a new product class like no other: THE INDUSTRIAL ENTERPRISE SERIES SSDs.

                                                                                 

 

Building Blocks of Industrial Enterprise Storage

 

1. The Foundation: Prime ICs/KGD, RDT

                                                                                                                                 

The acronyms above form the core of ATP’s industrial enterprise storage solutions: prime integrated circuits (ICs), known good die (KGD), and ATP’s Rapid Diagnostic Test (RDT). The process begins with sourcing prime ICs or KGD components; these top-quality ICs are then subjected to ATP’s own Rapid Diagnostic Test.

The primary objective of ATP RDT is to identify and screen out weak NAND flash blocks. This proprietary testing method is applied to 100% of devices and examines all areas of the drive. The comparison table below summarizes the advantages of ATP’s RDT.

 

|  | Connection | Temperature Test | Testing Area | Test Criteria |
|---|---|---|---|---|
| Traditional Burn-In Test | A host system is required, limiting interface bandwidth | Room temperature | Only the user area | Data patterns setting |
| ATP NAND Screening RDT (stricter testing criteria) | Controller directly tests the NAND flash | Extreme temperatures from -40°C to 85°C | All areas: firmware, user area, spare area | Strict testing criteria setting; classify various grades (rating) by: 1. various ECC thresholds, 2. later bad blocks, 3. spare blocks |

 

By employing this uncommon and resource-intensive process, ATP demonstrates its commitment to long-term reliability and endurance, prioritizing customer satisfaction over profit margins. This approach sets ATP apart from many memory module manufacturers who typically avoid such comprehensive testing procedures.

 

 

2. Ensuring Industrial Enterprise SSD Robustness Through Validation and Testing

 

Enterprise SSDs must undergo exhaustive validation and testing to ensure that they can withstand the high-stress demands of mission-critical applications.

  • Media/NAND-Level Testing characterizes NAND behavior and reliability with NAND error-handling mechanisms.
    • Read Disturb Test - Reads each of 1,000 randomly selected logical block addresses (LBAs) 500K times. This is especially critical for read-intensive boot/OS drives.

                                                                                                          

Figure: Read Disturb Test results show that after 12,000 program/erase (P/E) cycles, a single page can be read fewer than 6 million times before experiencing read disturb issues.

  • Write Amplification Index/Factor (WAI/WAF) Test – Subjects SSDs to multiple workloads to confirm the WAI/WAF and provide an accurate Sequential Terabytes Written (TBW) projection. WAI/WAF is the ratio of the total data written to the NAND flash to the total data written by the host. A WAI value of 1 is ideal, as it means that 1 MB of data from the host results in 1 MB written to the NAND flash memory on the storage device. A high WAI/WAF increases the number of writes to the NAND, thereby reducing the drive’s lifespan. According to the JESD219A Enterprise Endurance Workloads, the workload shall comprise random data with the following payload size distribution:
Random Data Payload Size Distribution (JESD219A)

| Payload Size | Share of Workload |
|---|---|
| 512 bytes (0.5K) | 4% |
| 1024 bytes (1K) | 1% |
| 1536 bytes (1.5K) | 1% |
| 2048 bytes (2K) | 1% |
| 2560 bytes (2.5K) | 1% |
| 3072 bytes (3K) | 1% |
| 3584 bytes (3.5K) | 1% |
| 4096 bytes (4K) | 67% |
| 8192 bytes (8K) | 10% |
| 16,384 bytes (16K) | 7% |
| 32,768 bytes (32K) | 3% |
| 65,536 bytes (64K) | 3% |

Source: JEDEC
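
As a rough illustration of the WAI/WAF relationship described above, the Python sketch below computes the factor from host and NAND write counters, derives a first-order TBW projection, and calculates the mean payload size implied by the JESD219A distribution. The function names, counter values, capacity, and P/E-cycle rating are hypothetical and are not ATP specifications. The TBW formula used here (capacity × rated P/E cycles ÷ WAF) is a common approximation and not necessarily the method ATP applies.

```python
# Payload size in bytes -> share of the workload in percent (JESD219A table above).
JESD219A_PAYLOAD_PERCENT = {
    512: 4, 1024: 1, 1536: 1, 2048: 1, 2560: 1, 3072: 1, 3584: 1,
    4096: 67, 8192: 10, 16384: 7, 32768: 3, 65536: 3,
}


def write_amplification(nand_bytes_written, host_bytes_written):
    """WAF = data written to NAND / data written by the host (1.0 is ideal)."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


def projected_tbw(capacity_tb, rated_pe_cycles, waf):
    """First-order endurance estimate in terabytes written (an approximation)."""
    if waf <= 0:
        raise ValueError("waf must be positive")
    return capacity_tb * rated_pe_cycles / waf


def mean_payload_bytes(distribution):
    """Weighted mean payload size for a {size_in_bytes: percent} mapping."""
    total_percent = sum(distribution.values())
    weighted_sum = sum(size * pct for size, pct in distribution.items())
    return weighted_sum / total_percent


# Hypothetical counters: the NAND absorbed twice what the host wrote.
waf = write_amplification(nand_bytes_written=3.0e12, host_bytes_written=1.5e12)
print("WAF:", waf)

# Hypothetical 1 TB drive rated for 3,000 P/E cycles.
print("Projected TBW:", projected_tbw(capacity_tb=1.0, rated_pe_cycles=3000, waf=waf))
print("Mean JESD219A payload (bytes):", mean_payload_bytes(JESD219A_PAYLOAD_PERCENT))
```

Doubling the WAF halves the projected TBW in this model, which is why confirming the factor under several workloads matters for an endurance claim.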

  • Enhanced Performance Testing. Tests the SSD’s performance with various workloads, a mix of read and write operations, and sequential/random addressing.

                                                                                             

  • Enhanced Power Consumption Testing. Characterizes the SSD’s power consumption under different workloads, rather than reporting only the maximum and average measurements under a single worst-case scenario.
  • Thermal Characterization & Testing. Includes testing the SSD’s thermal throttling behavior at various ambient operating temperatures and with various airflow settings.
  • Power Cycle Testing. Validates the design of the power loss protection (PLP) mechanism under sudden power-off conditions.
  • Operating Temperature Cycle Test. Operating temperature and input voltage can impact SSD functionality. Four-corner testing validates the power cycling reliability and operational functionality of the SSD at each of the four combinations of extremes: high/low operating temperature and high/low input voltage.
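
The four-corner idea in the Operating Temperature Cycle Test can be sketched as the Cartesian product of the temperature and voltage extremes. In the Python sketch below, the limits (-40°C and 85°C, and a 3.3 V rail at roughly ±5%) are placeholder assumptions for illustration only, not ATP’s published test conditions, and `four_corners` is a hypothetical helper.

```python
from itertools import product


def four_corners(temp_limits_c, voltage_limits_v):
    """Return every (temperature, voltage) pairing of the low/high limits."""
    return [
        {"temp_c": temp, "voltage_v": volt}
        for temp, volt in product(temp_limits_c, voltage_limits_v)
    ]


# Placeholder limits (assumptions): low/high temperature and low/high input voltage.
corners = four_corners(temp_limits_c=(-40, 85), voltage_limits_v=(3.135, 3.465))

for corner in corners:
    # A real test bench would run its power-cycle and functional checks here.
    print(corner)
```

Each of the four resulting conditions would then be used as the environment for a full power cycling and functional test run.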

 

3. Tailored Solutions: ATP’s Unique Capabilities in Building Industrial Enterprise SSDs

 

ATP is a true manufacturer with full control over its hardware and firmware development. This autonomy allows ATP to offer customization capabilities tailored to the specific needs of customers.

Under ATP’s WE BUILD WITH YOU program, the following enhanced Firmware Customization Services are available on a project basis to meet various enterprise customer needs in Server, Storage and Compute:

  • Performance Behavior Tuning. Performance behavior analysis and customization to optimize throughput and latency in a customer host application.
  • Power Loss Protection (PLP) Tuning
    • Optimized Flush Cache Timing. Ensures that the cache flush completes within the capacitors’ hold-up time, protecting the integrity of data in flight and at rest.
    • PLP Capacitor Monitoring. The onboard microcontroller unit (MCU) performs regular capacitor health checks during SSD bootup and operation.
  • Thermal Management Customization. ATP’s adaptive thermal throttling solution is distinguished by the ability to adjust the temperature settings according to the customer’s application-specific requirements.
  • Enhanced Read Disturb Resilience. The ATP firmware (FW) monitors frequently read and read-only data and refreshes it by moving it to another block when an error criterion or threshold is met. ATP can modify the FW to enhance its ability to withstand Read Disturb events in specific Enterprise use cases, such as Boot-Up Scenarios.
  • SMART ID Customization. The firmware includes a range of Self-Monitoring, Analysis and Reporting Technology (SMART) ID attributes that can be customized based on customer requirements.
  • Download Microcode Capability. This service is part of flexible firmware maintenance, enabling Enterprise customers to rapidly make updates to their specific configurations via field updates, avoiding the hassle of sending SSDs back to ATP for reinitialization.
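
The Optimized Flush Cache Timing item above amounts to a timing budget: the cache flush has to finish before the PLP capacitors discharge below the minimum operating voltage. A back-of-the-envelope check, assuming an ideal capacitor bank and a constant power draw, might look like the Python sketch below. All component values and function names are hypothetical and do not describe ATP hardware or firmware; real designs also account for capacitor aging, equivalent series resistance, and converter efficiency.

```python
def holdup_time_s(capacitance_f, v_start, v_min, load_w):
    """Seconds an ideal capacitor bank can sustain a constant load.

    Usable energy is 0.5 * C * (V_start^2 - V_min^2); dividing by the load
    power gives the hold-up time.
    """
    usable_energy_j = 0.5 * capacitance_f * (v_start ** 2 - v_min ** 2)
    return usable_energy_j / load_w


def flush_time_s(dirty_cache_bytes, nand_write_bytes_per_s):
    """Seconds needed to write the dirty cache contents to NAND."""
    return dirty_cache_bytes / nand_write_bytes_per_s


def flush_fits(holdup_s, flush_s, margin=0.8):
    """True if the flush fits within a derated share of the hold-up time."""
    return flush_s <= holdup_s * margin


# Hypothetical values: 1,000 uF bank, 12 V down to 6 V, 5 W load,
# 4 MiB of dirty cache, 1 GB/s sustained NAND write bandwidth.
holdup = holdup_time_s(capacitance_f=0.001, v_start=12.0, v_min=6.0, load_w=5.0)
flush = flush_time_s(dirty_cache_bytes=4 * 1024 * 1024, nand_write_bytes_per_s=1e9)

print(f"hold-up: {holdup * 1e3:.2f} ms, flush: {flush * 1e3:.2f} ms")
print("flush fits within budget:", flush_fits(holdup, flush))
```

The `margin` parameter is one simple way to leave headroom for capacitor degradation over the drive’s life, which is also why periodic capacitor health checks are useful.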

Conclusion: Why ATP? WE BUILD WITH YOU!

 

Top-tier companies look to ATP as they navigate the convergence of the enterprise and industrial worlds. With decades of manufacturing leadership in specialized memory and storage solutions, ATP Electronics is trusted as a strategic supplier by 70% of companies listed on Gartner’s Magic Quadrant report for Primary Storage, Data Center and Cloud Computing, and WAN-Edge Infrastructure. This speaks volumes about ATP’s ability to meet the most exacting demands of the enterprise.

For more information on ATP’s Industrial Enterprise Series SSDs, please contact an ATP Representative or visit the ATP website.

 
