Musings on the PLX kernel sources.

Postby telzey » Mon Mar 12, 2012 6:17 pm

So we've now got four sets of kernel source code from shipping PLX-NAS7820/7821 (aka ox820) machines to look at.

From the release dates and the kernel comments, they form a timeline view of the PLX support patches for the Linux kernel.

1) Cloud Engines Pogoplug V3 (kernel 2.6.31.6)
2) Iomega HMND Cloud Edition (kernel 2.6.31.14)
3) Silverstone DC01 (kernel 2.6.31.14)
4) Medion Life P89626 aka ZyXEL STG212 (kernel 2.6.31.14)

One thing that clearly comes across to me is that PLX is having a problem fixing a nasty bug relating to lost interrupts.

AFAIK, the ARM11 architecture doesn't synchronize I-cache and D-cache contents in hardware; it is done in software, by invalidating the cache contents whenever the code thinks it needs to.

This gets complicated when you've got a SoC doing DMA in the background and changing memory contents without telling the CPU.

This gets much, much more complicated on a dual-core processor like the ox820 where each core has different I-cache and D-cache contents.

PLX seem to be using FIQs to have one processor core tell the other that it needs to clear some/all of its cache.

There are a LOT of continual changes in the kernel code related to this system. There seems to be a problem with some of the FIQs getting lost and possibly resulting in cache coherency problems.
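To make that failure mode concrete, here is a toy user-space model in C. It is only a sketch, not PLX or kernel code: every name in it (`toy_system`, `toy_invalidate_all`, the `notify_delivered` flag) is made up for illustration, and one cached word per core stands in for a whole D-cache. It assumes the cross-core notification either arrives or is silently dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 2

/* One cached word per core: a deliberately tiny stand-in for a D-cache line. */
struct toy_cache_line {
    bool valid;
    uint32_t data;
};

struct toy_system {
    uint32_t memory;                        /* the word as it sits in RAM */
    struct toy_cache_line cache[NUM_CORES]; /* private copy per core */
};

/* A core reads through its own cache, filling the line on a miss. */
static uint32_t toy_read(struct toy_system *s, int core)
{
    struct toy_cache_line *line = &s->cache[core];

    if (!line->valid) {
        line->data = s->memory;
        line->valid = true;
    }
    return line->data;
}

/* A DMA engine writes RAM directly; no cache is told about it. */
static void toy_dma_write(struct toy_system *s, uint32_t value)
{
    s->memory = value;
}

/* Invalidate only the given core's line (a per-core cache operation). */
static void toy_invalidate_local(struct toy_system *s, int core)
{
    s->cache[core].valid = false;
}

/*
 * Invalidate on the calling core, then on the other cores, but only if
 * the (hypothetical) cross-core notification was actually delivered.
 */
static void toy_invalidate_all(struct toy_system *s, int self,
                               bool notify_delivered)
{
    toy_invalidate_local(s, self);
    for (int core = 0; core < NUM_CORES; core++) {
        if (core != self && notify_delivered)
            toy_invalidate_local(s, core);
    }
}
```

In this model, if both cores have read the word, a DMA write happens, and core 0 then calls `toy_invalidate_all()` with `notify_delivered` set to false, core 0 sees the new value while core 1 keeps returning the stale one. That is the kind of silent disagreement a lost FIQ would cause.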

Those cache problems would lead to what ZyXEL quaintly calls "instability" in the comment where they disable the 2nd CPU core on the STG212.

So where are we at with relation to the kernel used by ALARM?

Well, the 2.6.31.6 kernel and its PLX patches are hopelessly out-of-date. There are a lot of fixes in the later kernel sources, especially in relation to using the SATA port. You should definitely avoid formatting a partition as EXT4 with this kernel.

The Iomega HMNDCE has been shipping for a while and seems to be stable. It doesn't include all the latest fixes or support the newer devices that the DC01 or STG212 kernel code do, but then again, neither the Pogoplug nor Iomega actually use those devices.

The Silverstone DC01 kernel code is at an interesting midway point between the Iomega and the ZyXEL kernel code.

They have removed some of the safety code that is in both the Iomega and ZyXEL kernels, but have added at least one new safety check that is in neither of the others.

The DC01 itself is new enough that we really don't know about its stability.

The ZyXEL kernel code has to be looked at with a little wariness. While they actually shipped with most of the safety checks disabled, they also shipped their device with one of the two CPU cores disabled.

On the other hand the ZyXEL kernel code also includes a LOT of new changes related to attempts to fix the problem of lost FIQs, and so any newer code that we see from PLX will probably look more like the ZyXEL kernel than any of the older kernels.
telzey
 
Posts: 58
Joined: Fri Dec 16, 2011 8:42 pm

Re: Musings on the PLX kernel sources.

Postby WarheadsSE » Mon Mar 12, 2012 7:43 pm

Well, let me tell you.. there hasn't been much from PLX.

However, ftcode's expertise has helped a lot in the restructure of the 3.1+ kernel, and while the drivers are getting better, the SoC having this issue explains the WTF lockup that only happens under high memory pressure at full-bore CPU usage.
Core Developer
Remember: Arch Linux ARM is entirely community donation supported!
WarheadsSE
Developer
 
Posts: 6807
Joined: Mon Oct 18, 2010 2:12 pm

Re: Musings on the PLX kernel sources.

Postby telzey » Mon Mar 12, 2012 8:06 pm

I'd be quite surprised if PLX weren't the ones putting in the new code to fix this. It looks WAY too SoC-specific to be worth the while of their customers fixing.

It really looks to me like PLX are trying to set up a turnkey NAS solution for new-to-the-sector customers like Silverstone. For that, they've got to provide a stable working solution.

It's only long-term players like Iomega or Cloud Engines with their own existing customer interfaces that don't need the extra hand-holding that the PLX SDK seems to offer.

Since you're not actually a buying customer, you may not be on the fast-track to get all their latest code ... or I could just be wrong since I'm only working from the public sources of information.

The "WTF lockup" would be exactly the kind of problem that you'd see when things go bad! ;)
telzey
 
Posts: 58
Joined: Fri Dec 16, 2011 8:42 pm

Re: Musings on the PLX kernel sources.

Postby WarheadsSE » Mon Mar 12, 2012 8:52 pm

Yeah.. with NDA access, the official SDK files I can access haven't changed since June.
WarheadsSE
Developer
 
Posts: 6807
Joined: Mon Oct 18, 2010 2:12 pm

Re: Musings on the PLX kernel sources.

Postby telzey » Mon Mar 12, 2012 10:53 pm

That sucks! It sounds about right for the timeline of the Iomega and Silverstone patches.

The ZyXEL code looks newer, but maybe I'm wrong and they hacked it to death themselves trying to fix their "instability" ;)
telzey
 
Posts: 58
Joined: Fri Dec 16, 2011 8:42 pm

Re: Musings on the PLX kernel sources.

Postby telzey » Tue Mar 13, 2012 5:33 pm

Ooooh ... looks like there's more information and an explanation of the core problem in ARM Application Note 228 "Implementing DMA on ARM SMP Systems".

http://infocenter.arm.com/help/index.js ... index.html

It looks like my description of the problem at the start of this thread was probably wrong in the exact details, but correct about the basic problem.

Reading it gives me a better understanding of what the changes in the ZyXEL source are trying to do.

I wonder why they didn't actually turn them on in the shipping Medion product?
telzey
 
Posts: 58
Joined: Fri Dec 16, 2011 8:42 pm

Re: Musings on the PLX kernel sources.

Postby FileDescriptor » Sat Apr 07, 2012 7:12 pm

I just came across this thread about the high-memory-usage lockup problem and found a similar thread discussing DMA and cache coherence issues; see http://mkl-note.blogspot.de/2009/12/linux-arm11-mpcore-smp-cache-issue.html. Somebody there posted this patch:

    diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
    index be56c43..15dafb6 100644
    --- a/arch/arm/mm/dma-mapping.c
    +++ b/arch/arm/mm/dma-mapping.c
    @@ -584,15 +584,15 @@ static void dma_cache_maint_contiguous(struct page *page,

    switch (direction) {
    case DMA_FROM_DEVICE: /* invalidate only */
    - inner_op = dmac_inv_range;
    + inner_op = smp_dma_inv_range;
    outer_op = outer_inv_range;
    break;
    case DMA_TO_DEVICE: /* writeback only */
    - inner_op = dmac_clean_range;
    + inner_op = smp_dma_clean_range;
    outer_op = outer_clean_range;
    break;
    case DMA_BIDIRECTIONAL: /* writeback and invalidate */
    - inner_op = dmac_flush_range;
    + inner_op = smp_dma_flush_range;
    outer_op = outer_flush_range;
    break;
    default:

It sounds like a fix for the issue, so I just wanted to make you aware of that post. I'm not sure whether it is helpful at all.

But I guess the main problem is far too complicated to be fixed by those few lines of patch, isn't it?
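For anyone trying to picture what swapping `dmac_*` for `smp_dma_*` changes, here is a rough user-space sketch of the pattern in C. It is an assumption about the general idea (pick the cache operation by DMA direction, then run it on every core instead of only the calling one). It is not the actual implementation of `smp_dma_inv_range` and friends, which that post does not show. All `toy_*` names are hypothetical, and counters stand in for real cache maintenance.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_CORES 2

enum toy_dma_direction {
    TOY_DMA_BIDIRECTIONAL,
    TOY_DMA_TO_DEVICE,
    TOY_DMA_FROM_DEVICE
};

/* A cache operation applied to one core's private cache. */
typedef void (*toy_cache_op)(int core);

/* Counters stand in for real cache maintenance, so the effect is visible. */
static int toy_inv_count[TOY_NUM_CORES];
static int toy_clean_count[TOY_NUM_CORES];

static void toy_inv_local(int core)   { toy_inv_count[core]++; }
static void toy_clean_local(int core) { toy_clean_count[core]++; }

/* "Flush" here means writeback followed by invalidate. */
static void toy_flush_local(int core)
{
    toy_clean_local(core);
    toy_inv_local(core);
}

/* Pick the operation by direction, mirroring the switch in the patch. */
static toy_cache_op toy_select_op(enum toy_dma_direction dir)
{
    switch (dir) {
    case TOY_DMA_FROM_DEVICE:   return toy_inv_local;   /* invalidate only */
    case TOY_DMA_TO_DEVICE:     return toy_clean_local; /* writeback only */
    case TOY_DMA_BIDIRECTIONAL: return toy_flush_local; /* both */
    }
    return NULL;
}

/*
 * smp == 0: only the calling core (core 0 in this toy) does the work.
 * smp != 0: the same operation is run on every core.
 */
static void toy_dma_cache_maint(enum toy_dma_direction dir, int smp)
{
    toy_cache_op op = toy_select_op(dir);

    if (op == NULL)
        return;
    if (!smp) {
        op(0);
        return;
    }
    for (int core = 0; core < TOY_NUM_CORES; core++)
        op(core);
}
```

Calling `toy_dma_cache_maint(TOY_DMA_FROM_DEVICE, 0)` bumps only core 0's invalidate counter, while passing 1 as the second argument bumps both. On real hardware the second case needs some way to reach the other core, which is where the lost-notification problem discussed earlier in the thread comes back in.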

FileDescriptor
FileDescriptor
 
Posts: 4
Joined: Fri Apr 06, 2012 10:03 am

Re: Musings on the PLX kernel sources.

Postby WarheadsSE » Sat Apr 07, 2012 11:36 pm

For 2.6.31, I'd say that might be useful. As for 3.1 (see https://github.com/WarheadsSE/OX820-3.1 ... -mapping.c), it appears not to apply.

Thanks for pointing us to it, though.
WarheadsSE
Developer
 
Posts: 6807
Joined: Mon Oct 18, 2010 2:12 pm

Re: Musings on the PLX kernel sources.

Postby telzey » Tue Apr 10, 2012 4:36 pm

Yeah, thanks for finding that, FileDescriptor! :D

I'm really busy at the moment, but when I have some free time I'll take a look at those fixes that were posted.

In the meantime, the patched 2.6.31.14 kernel that I'm using seems to be completely stable under the same load that kills the 3.1 WIP kernel ;)
telzey
 
Posts: 58
Joined: Fri Dec 16, 2011 8:42 pm

Re: Musings on the PLX kernel sources.

Postby WarheadsSE » Wed Apr 11, 2012 1:29 pm

Telzey,

Is that including the patch from FileDescriptor? The last time I tried it, I got a hang. If your patch set is out of date, let me know, I'd like to try again.


As for the release everyone is asking about, I have 2.6.13.6 built with as many options as possible. Some patching had to be done to get a load of crap working (aren't kernel bugs FUN!!!), and it still needs testing.
WarheadsSE
Developer
 
Posts: 6807
Joined: Mon Oct 18, 2010 2:12 pm
