by lacouture » Thu May 24, 2012 4:02 pm
Hi,
xenoxaos wrote:
> Also, if your device is almost full, the only spots that it can wear level over is the empty space.
I'm doubtful about this.
To summarize: wear leveling is not so much about how much unused space is left on your drive, but about the ratio between "hot" blocks, that change very often, and "cold" blocks that never change.
For the curious, the explanation is below. It's just my two cents; although I work in a domain where wear leveling is critical, I'm not an expert on it, so I may be wrong.
Wear leveling algorithms generally work at the flash-block level, and are thus generally unaware of the filesystem on top. Therefore, they simply don't know where the empty spaces are.
Doing otherwise would require either:
- the filesystem to indicate to the wear leveling algorithm (generally running in the device's controller), on the fly, where the used and unused blocks are;
- or the device to reverse-engineer the filesystem operations to guess where these blocks are!
For this reason, as far as I know, flash devices use two techniques to do wear leveling:
- From the computer's point of view, blocks have a fixed (virtual) address, independent of their physical position. Blocks that are updated very often can thus be "moved" around, while blocks that are seldom written stay in place, without the computer knowing it. This is a tradeoff: the higher stress on "hot" blocks is distributed to less stressed blocks. Statistically, the "empty" blocks are indeed good candidates to absorb some of the stress of the hot blocks, but this mechanism also diverts stress to blocks that are never updated yet still in use (e.g. program files). Concretely, if a block that is updated 1000 times per second is exchanged in turn with 999 blocks that are never updated, then, since each update costs roughly two physical writes (the hot data goes to the new block, and the cold data it displaces has to be relocated), on average you get 1000 blocks that are each written about twice per second (this is simplified). The beauty of it is that the algorithm does not need to know whether the blocks that never change are unused or just static data; it simply levels the stress down from a very few hot blocks onto many more, colder blocks.
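For illustration, here is a toy model of that kind of dynamic wear leveling (everything here is a made-up sketch, not any real controller's algorithm): a logical-to-physical mapping, where each write to a hot logical block relocates it to the least-worn physical block, and the cold data it displaces is moved out of the way (one extra erase):

```python
PHYS_BLOCKS = 100                    # toy device: 100 physical blocks
erase_count = [0] * PHYS_BLOCKS
mapping = list(range(PHYS_BLOCKS))   # logical -> physical block

def write_logical(lba):
    """Rewrite one logical block. The controller erases its current
    physical block, then swaps it with the least-worn physical block,
    so hot data keeps migrating and cold blocks share the stress."""
    phys = mapping[lba]
    erase_count[phys] += 1
    coldest = min(range(PHYS_BLOCKS), key=lambda p: erase_count[p])
    if coldest != phys:
        # relocating the cold data out of `coldest` costs an erase too
        erase_count[coldest] += 1
        other = mapping.index(coldest)   # logical block living there
        mapping[lba], mapping[other] = coldest, phys

# hammer a single logical block 10,000 times
for _ in range(10_000):
    write_logical(0)

# roughly two physical erases per write, spread almost evenly
print(sum(erase_count), max(erase_count) - min(erase_count))
```

Even though only one logical block is ever written, every physical block ends up with nearly the same erase count, whether the other logical blocks held static data or nothing at all.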
- Even with the mechanism above, if/when a physical block dies, it is marked as discarded and replaced by a block from a set of spares. You generally don't know how many spare blocks your device has, but when they are depleted, well, it's time to change the device.
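A sketch of that spare-pool mechanism (again a hypothetical model with made-up numbers; real controllers keep all of this hidden in firmware): the device silently remaps a dead physical block to a spare, until the spares run out:

```python
VISIBLE, SPARES = 8, 2                      # tiny imaginary device
mapping = list(range(VISIBLE))              # logical -> physical block
spare_pool = list(range(VISIBLE, VISIBLE + SPARES))
bad_blocks = set()

def retire(phys):
    """A physical block just died: mark it bad and transparently
    remap its logical block to a block taken from the spare pool."""
    bad_blocks.add(phys)
    if not spare_pool:
        raise RuntimeError("spare pool depleted: time to change the device")
    spare = spare_pool.pop(0)
    mapping[mapping.index(phys)] = spare
    return spare

retire(3)                  # first failure: handled silently
retire(5)                  # second failure: last spare used up
print(mapping, spare_pool)
# one more failure would raise: the device has reached end of life
```

From the outside the device looks healthy until the very last spare is consumed, which is why the failure feels so sudden.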
The bottom line, then, when deciding how to partition your data over flash media:
- If possible, don't put hot files/swap on flash media.
- If you're forced to do so:
- Either put all your hot files on one media, and all "cold" files on another. You know the hot media will die quickly and will have to be replaced, but the cold one will last much, much longer.
- Or mix cold and hot files on the same media; depending on the distribution of hot and cold files, that media will last longer than the hot media of the first option, but not as long as the cold one. And when it dies, you'll probably lose everything at once.
Partitioning won't necessarily help separate hot areas from cold ones, because the wear leveling algorithm may well be unaware of partitions, and will divert stress from a "hot partition" to the "cold partition" that sits next to it.
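To illustrate that last point with the same kind of toy model (hypothetical numbers, not any real controller): two "partitions" are just two ranges of logical blocks, but the controller levels wear over one shared physical pool, so writes to the hot partition wear out the blocks currently backing the cold one:

```python
PHYS = 100
erase = [0] * PHYS
mapping = list(range(PHYS))    # logical -> physical block
HOT = range(0, 50)             # "hot partition": logical blocks 0..49
COLD = range(50, 100)          # "cold partition": never written

def write(lba):
    """Same simplified wear leveling as before: erase the current
    physical block, then swap with the least-worn one."""
    phys = mapping[lba]
    erase[phys] += 1
    coldest = min(range(PHYS), key=lambda p: erase[p])
    if coldest != phys:
        erase[coldest] += 1
        other = mapping.index(coldest)
        mapping[lba], mapping[other] = coldest, phys

# write only to the "hot partition"
for i in range(5000):
    write(i % 50)

# yet the physical blocks backing the "cold partition" are worn too
cold_wear = sum(erase[mapping[l]] for l in COLD)
print(cold_wear > 0)
```

The cold partition never saw a single write, but the physical blocks behind it have been erased anyway, because the pool of blocks is shared device-wide.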