Overclock.net - An Overclocking Community


chrcoluk 08-25-2019 10:59 PM

Samsung 512gig 850 PRO about to die it seems, very low erase cycles
 
I have a 4 year 8 month old SSD, and the PC it's in is on 24/7.

I don't consider the workload on the SSD unusual. It basically hosts the OS files, but I have moved the documents, pictures and downloads folders onto a spindle.
Also temporary internet files (cache) are moved to a ramdisk, which would greatly reduce writes.
For most of this time windows search (indexing) has been disabled, which reduces writes to the index database (it defaults to the system drive).
I don't use hibernate.
The Steam download cache folder is moved to a spindle.
The pagefile is enabled however.

There is also a games partition where I have about 300 gig of games installed. These are very light on writes though, as the games rarely get updated.

This is important to explain so people can picture the load put on the SSD.

From what I can observe whenever I have monitored i/o activity there is a constant stream of small (probably 4k) writes to the ssd, usually with windows updating log files, and page file updates.

Also even tho temporary internet files have been moved, chrome is quite a write heavy browser, it has various internal databases that constantly get updated such as safebrowsing, history, autocomplete, session, and cookies. Observation again indicates mostly small but frequent i/o.
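To put a number on the "small but frequent" observation, here is a minimal Python sketch that turns two snapshots of a disk's cumulative write counters into an average write size. The `DiskCounters` type and the snapshot values are made up for illustration; they are not measurements from this drive.

```python
from typing import NamedTuple


class DiskCounters(NamedTuple):
    """Cumulative counters for one disk (hypothetical container type)."""
    write_count: int   # number of write operations so far
    write_bytes: int   # total bytes written so far


def average_write_size(before: DiskCounters, after: DiskCounters) -> float:
    """Average bytes per write between two snapshots; 0.0 if no writes happened."""
    writes = after.write_count - before.write_count
    if writes <= 0:
        return 0.0
    return (after.write_bytes - before.write_bytes) / writes


# Made-up snapshots taken some time apart: 500 writes totalling 2,048,000 bytes.
before = DiskCounters(write_count=1_000, write_bytes=8_000_000)
after = DiskCounters(write_count=1_500, write_bytes=10_048_000)
print(average_write_size(before, after))  # 4096.0
```

If the third-party psutil package is installed, `psutil.disk_io_counters(perdisk=True)` returns per-disk tuples that include `write_count` and `write_bytes`, so real snapshots could be fed into the same function.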

So when I post the stats here, the amount of data written shouldn't be too much of a surprise given the observations. This is with the steam download cache moved, and internet cache moved.

I do have 32 gig of ram, but windows memory management will start using the pagefile well before ram utilisation is anywhere near 100%.

The internet cache before I moved it was generating about 2-5 gig a day in writes. It was moved when the SSD was about 2 months old.

So the usage data is:

20TB life time writes.
Just 28 erase cycles
99% estimated life left.
POH 2338 days
0 CRC, reallocated etc.

For comparison my oldest SSD which is currently in my laptop and was my main SSD before the 850 pro has these stats. This is a planar MLC 120 gig samsung 830 SSD. No symptoms of failure yet. Note the much higher erase cycles.

POH 2205 days
Erase cycles 438 (way more)
88% life left
9.11TB written
0 CRC etc.
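As a rough way to compare the two sets of figures, the sketch below works out how many times each drive's capacity has been written ("fills") and how many reported erase cycles that comes to per fill. It assumes the TB and gig figures are decimal units and that the advertised capacity is a fair stand-in for the NAND size, neither of which is confirmed here, so treat the results as ballpark only.

```python
TB = 10**12  # assumption: decimal terabytes
GB = 10**9   # assumption: decimal gigabytes


def drive_fills(host_bytes_written: float, capacity_bytes: float) -> float:
    """How many times the drive's full capacity has been written by the host."""
    return host_bytes_written / capacity_bytes


def cycles_per_fill(erase_cycles: float, host_bytes_written: float,
                    capacity_bytes: float) -> float:
    """Reported erase cycles divided by full-drive fills (a crude amplification hint)."""
    return erase_cycles / drive_fills(host_bytes_written, capacity_bytes)


# (bytes written, capacity, reported erase cycles) taken from the figures in this post
drives = {
    "850 PRO 512GB": (20 * TB, 512 * GB, 28),
    "830 120GB": (9.11 * TB, 120 * GB, 438),
}

for name, (written, capacity, cycles) in drives.items():
    fills = drive_fills(written, capacity)
    ratio = cycles_per_fill(cycles, written, capacity)
    print(f"{name}: {fills:.1f} fills, {ratio:.2f} erase cycles per fill")

# Average daily writes on the 850 PRO, taking 4 years 8 months as roughly 1700 days.
AGE_DAYS = 4 * 365 + 8 * 30
print(f"850 PRO average: {20 * TB / AGE_DAYS / GB:.1f} GB/day")
```

Under those assumptions the 850 PRO comes out at roughly 39 fills and about 0.7 erase cycles per fill, against roughly 76 fills and about 5.8 cycles per fill for the 830, with the 850 PRO averaging around 12 GB of writes a day. A ratio below 1 would not be expected if both counters measured the same thing, which hints that the reported figures are not directly comparable.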

Have a look at this site, which explains the system area of SSDs. This part of an SSD cannot perform wear levelling; it's used for sector mapping and other tasks, and I believe its wear isn't accounted for in SMART data.

https://blog.elcomsoft.com/2019/01/w...-deal-with-it/

Scroll down to "Why SSD Drives Fail with no SMART Errors"

So the workload I have described kind of fits that case, constant small amounts of i/o rather than things like writing large iso files.
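To illustrate why lots of small scattered writes could hit the mapping data harder than one big sequential write, here is a toy Python model. It is not how Samsung's firmware works; the mapping page size, the flush interval and the function names are all hypothetical, picked only to show the shape of the effect.

```python
import random

ENTRIES_PER_MAP_PAGE = 1024  # hypothetical: mapping entries held in one mapping page
FLUSH_EVERY = 64             # hypothetical: dirty mapping pages flushed every 64 host writes


def map_page_flushes(logical_pages, flush_every=FLUSH_EVERY):
    """Count mapping-page writes to the system area for a stream of 4K host writes."""
    dirty = set()
    flushed = 0
    for i, lpn in enumerate(logical_pages, start=1):
        dirty.add(lpn // ENTRIES_PER_MAP_PAGE)  # mapping page covering this logical page
        if i % flush_every == 0:
            flushed += len(dirty)
            dirty.clear()
    return flushed + len(dirty)  # final flush of whatever is still dirty


def scattered_writes(count, total_pages, seed=0):
    """Small writes landing at random logical pages across the whole drive."""
    rng = random.Random(seed)
    return [rng.randrange(total_pages) for _ in range(count)]


def sequential_writes(count, start=0):
    """One large file written to consecutive logical pages."""
    return list(range(start, start + count))


if __name__ == "__main__":
    total_pages = 512 * 10**9 // 4096  # 4K logical pages on a 512 GB drive
    n = 100_000                        # same amount of host data in both cases
    print("scattered :", map_page_flushes(scattered_writes(n, total_pages)))
    print("sequential:", map_page_flushes(sequential_writes(n)))
```

In this toy model the same amount of host data causes far more mapping-page writes when it arrives as scattered small writes, because nearly every write dirties a different mapping page before the next flush.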

So what are the symptoms?

Well ironically the trigger event was windows search (indexing).

I decided to use the file history feature to make backups of some files on my system. The way this works is it will by default back up library locations, so I did as documented: added the folders I wanted backed up to libraries, and selected a backup drive. There is one last requirement though, and that is the locations have to be indexed, so windows search is a requirement. So of course I enabled windows search.

I was watching the i/o in task manager, and there were lots of writes going to the 850 pro (with it being the system drive, and the default location for the index being in programdata), and suddenly in task manager the drive went to 100% active state.

I watched and it stayed there for two minutes. I then tried to do things on the computer as well as browse: chrome stopped responding as it was trying to update its databases, the start menu didn't do anything, and eventually nothing was working. I could move the mouse pointer but that's it.

I hit the reset button on my PC and on POST it waited for an eternity and eventually showed "no boot device found".

I went into the BIOS and every drive was present except the 850 pro.

Luckily on a power cycle the drive came back.

I have not had a hang in windows since, but when doing a new system drive backup, reads were very slow, bottlenecking at about 30MB/sec. The spindle it was writing to was not breaking a sweat. If new data is written it can be read back quickly, but old stale data seems hard to read (remember that infamous 840 bug?).

Since the incident I have rebooted 3 times, and on 2 of those 3 occasions I got no boot device found. A power cycle was required to get it to boot.
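For anyone wanting to check whether old files read slower than fresh ones, a minimal Python sketch along these lines times a sequential read of one file. The function name is made up for illustration, and a file read recently may come from the OS cache rather than the SSD, so the result only means something for data that has not been touched for a while.

```python
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Read a file start to finish; return (bytes_read, MB per second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length timing on very small files.
    mb_per_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, mb_per_s
```

Calling it on a file that has sat untouched for years and then on a freshly written copy of the same file would give a rough old-versus-new comparison.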

Most people claim modern 3D NAND drives are going to last decades because of their endurance ratings, but these drives have a weak point in the system area. Even with that in mind, this is my first issue with an SSD I own.

I own 2 samsung 830s which both have had no issues.
This 850 pro which I think will die soon.
A few cheapo small kingston SSDs, all planar MLC and not used 24/7, except one in my pfsense unit which is a business grade mini sata.

Thoughts?

I cannot do an online RMA with samsung. Today is a public holiday in the UK, so I probably won't be able to start the process until tuesday. I am hoping they have advanced RMA, as obviously with it being my system drive it makes things awkward.

chrcoluk 08-26-2019 02:34 AM

Ok started the RMA process, it's a 10 working day wait :( no advanced RMA :(

I think what I am going to try and do, which may play havoc with the install, is import the macrium image to a spare 120 gig kingston ssd I have as a temporary measure, so I can still use the PC, as 2 weeks is a long time. Then I will do the same again when the new SSD arrives.

I know from past experience though that windows does some weird drive remapping when the boot drive changes, and I am not happy about doing this twice.

Liranan 08-26-2019 09:31 PM

Good luck with the RMA, I hope it all gets sorted out soon.

chrcoluk 08-26-2019 10:37 PM

1 Attachment(s)
For the curious, I listed i/o activity in resource monitor, sorted by writes.

Check the amount of chrome hits. There is also a windows log, related to windows audio glitch detection, that is constantly written to.

sok0 08-27-2019 09:23 AM

The amount of time you've already wasted trying to fix/troubleshoot a 5 year old SSD is not worth it considering the price you pay for a brand new latest and greatest SSD. Just dump it and move on.

chrcoluk 08-28-2019 05:24 AM

The RMA process has already started ;) it's not even in my system anymore.

However there is nothing wrong with looking for the reason it happened.

There is a very long bug report in the chrome developers bug section regarding the huge amount of writes the browser carries out, and what was interesting is samsung specifically asked me if I use chrome. They didn't ask if I use any other piece of software, just that question.

Also finally your comment is a bit disrespectful to those who are poor. It reads as if buying a new SSD is easy and very affordable for everyone, and as if people losing data doesn't mean anything.

It is kinda sad people talk as if 5 years old is very old and think it normal routine to dump things after only a couple of years in a throw away type of manner. Funny enough though, the latest and greatest SSDs are worse than the older ones, the SSDs put to market are a downgrade, and samsung's QLC drive is their latest and greatest. That may well have died just after a year.

briank 08-28-2019 05:47 AM

That's interesting that the "system area" which may also be known as the FTL (Flash Translation Layer) has a fixed area of flash on a "Pro" drive. I guess it must be a cost savings measure to save on DRAM where a true enterprise class SSD would keep its FTL and only write to the system NAND area on power down. This type of FTL implementation is probably typical in most modern low cost SSDs though.

I think this limitation of the drives could make the case that good system design is to buy a smaller ~256GB boot SSD and not use your expensive 2TB primary storage SSD as a boot/OS drive. That way you can keep the large drive for a long time and just plan on upgrading your OS drive every 4-5 years.

rui-no-onna 08-28-2019 09:16 AM

Quote:

Originally Posted by briank (Post 28106214)
That's interesting that the "system area" which may also be known as the FTL (Flash Translation Layer) has a fixed area of flash on a "Pro" drive. I guess it must be a cost savings measure to save on DRAM where a true enterprise class SSD would keep its FTL and only write to the system NAND area on power down. This type of FTL implementation is probably typical in most modern low cost SSDs though.

Hmm, looks like it's still stored in DRAM (for performance reasons) but flushed more frequently to NAND due to lack of backup capacitors.

https://image-us.samsung.com/Samsung...-r1-JUL16J.pdf

Quote:

Due to the high cost of capacitors and the price sensitivity of the client SSD market, client SSDs do not normally include full power loss protection in the form of backup power circuitry. While the FTL is frequently flushed from the DRAM to the NAND, there is still a window of opportunity where the DRAM contains a newer version of the FTL in the event of an unexpected power loss. Fortunately, there are alternative and more cost efficient mechanisms to prevent the FTL from corrupting.

Quote:

Originally Posted by briank (Post 28106214)
I think this limitation of the drives could make the case that good system design is to buy a smaller ~256GB boot SSD and not use your expensive 2TB primary storage SSD as a boot/OS drive. That way you can keep the large drive for a long time and just plan on upgrading your OS drive every 4-5 years.

Unfortunately, using the same SSD for boot/OS and storage is sometimes unavoidable (e.g. laptops, NUCs). Although it's good that even ultra small form factor computers nowadays support M.2 + 2.5-inch drives.

rui-no-onna 08-28-2019 09:46 AM

Quote:

Originally Posted by chrcoluk (Post 28106206)
It is kinda sad people talk as if 5 years old is very old and think it normal routine to dump things after only a couple of years in a throw away type of manner.

To be honest, I expect and experience higher failure rates on HDDs (usually around 3 years?) than SSDs. Also, iirc, NAND would lose charge after 10 years so they're really not for archival.

Quote:

Originally Posted by chrcoluk (Post 28106206)
Funny enough though, the latest and greatest SSDs are worse than the older ones, the SSDs put to market are a downgrade, and samsung's QLC drive is their latest and greatest. That may well have died just after a year.

Unfortunate, but true. Every die shrink means lower resilience to leakage in terms of NAND itself. We just have better controllers to mitigate.

Mind, I actually did get the planar TLC 840 500GB and 840 EVO 1TB when those were released. I wouldn't consider buying those models at low capacities, though. Also, I actually would consider buying 3D QLC but only at 4TB+.

JackCY 08-28-2019 10:04 AM

840 EVO 250GB TLC
37.5 TBW
26.5k hours, over 1097 days
88% wear level

No issues so far. Everything from the system sits on it, so did games, only 1 game's logging/replay folder was relocated to HDD.

No idea where you see erase cycles.

Not everything is shiny on the expensive PRO models.

