[Celinux-dev] Application XIP on Linux 2.6.17

Paolo Giarrusso p.giarrusso at gmail.com
Fri Sep 1 09:59:53 PDT 2006


On 9/1/06, Alexey, Korolev <alexey.korolev at intel.com> wrote:
> Hi all,
>
> I need application XIP on Linux 2.6.17, but I have run into problems
> with it. They are related to the large changes in the memory management
> code between 2.6.14 and 2.6.17.
> I tried to adapt the patch for Linux 2.6.14:
> http://tree.celinuxforum.org/CelfPubWiki/PatchArchive?action=AttachFile&do=get&target=cramfs-linear-xip-2.6.14.patch
>
> but adapting the patch didn't help. Linux still ends in a kernel panic.

You first need to understand the VM changes by looking at the git
changelogs for mm/.

> I wonder whether anybody else has faced the same issues with kernels
> newer than 2.6.14?
>
> I'm new to the memory management code of the Linux kernel. I would
> very much appreciate it if somebody could explain to me why we need
> this code to provide application XIP.

I've looked for updated patches but found none. Many sites say that
current XIP support is only for kernel execution, but the code says
otherwise. My impression is that this change is not needed in the
new kernel: any range mapped with remap_pfn_range will be handled this
way automatically (and CRAMFS maps only XIP pages that way). However,
this code does not use the generic XIP code present in 2.6.15+ kernels
(and maybe even earlier ones), and changing that may be a good idea.

So, what does this code do? Suppose that the .data section of an
executable on cramfs is mapped: it will be mapped
PROT_READ|PROT_WRITE, MAP_PRIVATE, but the PTE will be read-only.
Suppose a variable in this page is then written: a protection fault
is generated, and a new copy of the page must be allocated so that the
file is not altered (MAP_PRIVATE) and the write succeeds.
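
The MAP_PRIVATE behaviour described above can be observed from user
space. The following is only a sketch on an ordinary file, not on
cramfs or XIP memory; the function name and the scratch file path are
hypothetical and chosen just for this example. It reads the page first
(so that it is mapped read-only), then writes to it, and finally checks
that the file itself still holds the old data.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write through a PROT_READ|PROT_WRITE, MAP_PRIVATE mapping of a
 * scratch file. Returns 1 if the mapping sees the write while the file
 * keeps its original contents, 0 if not, -1 on a setup error.
 */
int private_write_is_cow(void)
{
    char path[] = "/tmp/cow-demo-XXXXXX";   /* hypothetical scratch file */
    static const char original[] = "original";
    char from_file[sizeof(original)];
    long page = sysconf(_SC_PAGESIZE);
    volatile char *map;
    void *addr;
    char before;
    int result = -1;
    int fd;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file lives until close(fd) */

    if (page <= 0 || ftruncate(fd, page) != 0 ||
        pwrite(fd, original, sizeof(original), 0) != (ssize_t)sizeof(original))
        goto out;

    addr = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED)
        goto out;
    map = addr;

    before = map[0];    /* read fault: page is mapped, PTE is read-only */
    map[0] = 'X';       /* write fault: the kernel gives us a private copy */

    if (pread(fd, from_file, sizeof(original), 0) == (ssize_t)sizeof(original))
        result = (before == 'o' && map[0] == 'X' &&
                  memcmp(from_file, original, sizeof(original)) == 0);

    munmap(addr, (size_t)page);
out:
    close(fd);
    return result;
}
```

A return value of 1 means the write landed in a private copy and the
file was left alone, which is exactly what the fault handler has to
guarantee for XIP pages too.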

However, it is possible that do_wp_page(), on seeing that
vm_normal_page() fails, will still handle the fault correctly by
calling cow_user_page(). VM_PFNMAP has to be set for that, but
remap_pfn_range does it.
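
To illustrate the idea, here is a simplified user-space model of that
fallback. It is not kernel code: every name in it (model_page,
model_copy_from_user, model_cow_copy, MODEL_PAGE_SIZE) is hypothetical,
and the zero-fill on a failed copy is how I remember the current
cow_user_page() behaving, so check it against the sources.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096    /* assumed page size for this model */

/* Stand-in for a RAM page that has a struct page behind it. */
struct model_page {
    unsigned char data[MODEL_PAGE_SIZE];
};

/*
 * Hypothetical stand-in for copying from the faulting user address.
 * Returns the number of bytes NOT copied (0 on success). A NULL source
 * models a mapping that cannot be read.
 */
static size_t model_copy_from_user(void *dst, const void *uaddr, size_t n)
{
    if (uaddr == NULL)
        return n;
    memcpy(dst, uaddr, n);
    return 0;
}

/*
 * Fill the freshly allocated page `dst` for a COW fault.
 * src == NULL models "vm_normal_page failed": the source is a PFN
 * mapping (e.g. XIP flash) with no struct page, so copy through the
 * faulting address instead and zero-fill if that does not work.
 */
void model_cow_copy(struct model_page *dst, const struct model_page *src,
                    const void *uaddr)
{
    if (src == NULL) {
        if (model_copy_from_user(dst->data, uaddr, MODEL_PAGE_SIZE) != 0)
            memset(dst->data, 0, MODEL_PAGE_SIZE);
        return;
    }
    memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
}
```

The point of the model is only that the "no struct page" case needs no
XIP-specific branch: the copy goes through the user mapping, which is
what the patch quoted below does by hand.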

Quite frankly, I'm surprised that EXT2 XIP support does not have such
code. If application XIP is not supported there, that would explain
it, but the comments are clear enough about it being supported.

Some pointers about the original implementation are here:
http://lwn.net/Articles/135472/
http://search.gmane.org/?query=execute+in+place+support&email=Carsten+Otte&group=gmane.linux.kernel&sort=relevance&DEFAULTOP=and&query=

So this code should go (if needed) in do_wp_page, I guess. (If it is
not there or in that code path, I'm being misled; I do not have the
2.6.14 sources available right now.) Currently the pfn_valid check
after which this patch is inserted is in vm_normal_page, which is
also used elsewhere (IIRC), so the change must not go in
vm_normal_page itself but must be done where vm_normal_page fails. As
the vm_normal_page comment and the comment below explain,
vm_normal_page will fail because "the source memory is not RAM so no
struct page is associated with it".

Also note that the spinlocking has changed: the spinlock to use may
depend on the pte (right now this is true on big SMP boxes, IIRC).
See pte_offset_map_lock and related macros.

> +        if ((vma->vm_flags & VM_XIP) && pte_present(pte) &&
> +            pte_read(pte)) {
> +            /*
> +             * Handle COW of XIP memory.
> +             * Note that the source memory actually isn't a ram
> +             * page so no struct page is associated to the source
> +             * pte.
> +             */
> +            char *dst;
> +            int ret;
> +
> +            spin_unlock(&mm->page_table_lock);
> +            new_page = alloc_page(GFP_HIGHUSER);
> +            if (!new_page)
> +                return VM_FAULT_OOM;
> +
> +            /* copy XIP data to memory */

> +            dst = kmap_atomic(new_page, KM_USER0);
> +            ret = copy_from_user(dst, (void*)address, PAGE_SIZE);
> +            kunmap_atomic(dst, KM_USER0);

These 3 lines are now in cow_user_page().

The code below seems to be a patched copy of what is under this comment in do_wp_page:
        /*
         * Re-check the pte - we dropped the lock
         */

> +            /* make sure pte didn't change while we dropped the
> +               lock */
> +            spin_lock(&mm->page_table_lock);
> +            if (!ret && pte_same(*page_table, pte)) {
> +                ++mm->_rss;
> +                break_cow(vma, new_page, address, page_table);
> +                lru_cache_add(new_page);
> +                page_add_file_rmap(new_page);
> +                spin_unlock(&mm->page_table_lock);
> +                return VM_FAULT_MINOR;    /* Minor fault */
> +            }
> +
> +            /* pte changed: back off */
> +            spin_unlock(&mm->page_table_lock);
> +            page_cache_release(new_page);
> +            return ret ? VM_FAULT_OOM : VM_FAULT_MINOR;
> +        }
> +
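
The unlock / allocate / re-lock / pte_same sequence in the quoted patch
can be modelled in user space as follows. This is a single-threaded
sketch with hypothetical names (model_mm, model_xip_cow,
model_racer_remap and so on), not kernel code: the "lock" is a flag
with assertions, the "pte" is a plain pointer, and a callback stands in
for another thread that runs while the lock is dropped. The copy never
fails in this model, so the back-off path always reports a minor fault.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096    /* assumed page size for this model */

struct model_mm {
    int locked;                 /* stand-in for the page table lock */
    const void *pte;            /* stand-in for *page_table */
    unsigned long rss;
};

enum model_fault { MODEL_FAULT_MINOR, MODEL_FAULT_OOM };

unsigned char model_other_page[MODEL_PAGE_SIZE];

static void model_lock(struct model_mm *mm)
{
    assert(!mm->locked);
    mm->locked = 1;
}

static void model_unlock(struct model_mm *mm)
{
    assert(mm->locked);
    mm->locked = 0;
}

/* Example racer: changes the mapping while the lock is dropped. */
void model_racer_remap(struct model_mm *mm)
{
    model_lock(mm);
    mm->pte = model_other_page;
    model_unlock(mm);
}

/* Entered with the lock held, returns with it released. */
enum model_fault model_xip_cow(struct model_mm *mm,
                               void (*racer)(struct model_mm *))
{
    const void *orig = mm->pte;
    void *copy;

    model_unlock(mm);                   /* the allocation may sleep */
    copy = malloc(MODEL_PAGE_SIZE);
    if (!copy)
        return MODEL_FAULT_OOM;
    memcpy(copy, orig, MODEL_PAGE_SIZE);
    if (racer)
        racer(mm);                      /* somebody else gets to run */

    model_lock(mm);                     /* re-check: we dropped the lock */
    if (mm->pte == orig) {
        mm->pte = copy;                 /* install the private copy */
        mm->rss++;
        model_unlock(mm);
        return MODEL_FAULT_MINOR;
    }

    model_unlock(mm);                   /* pte changed: back off */
    free(copy);
    return MODEL_FAULT_MINOR;
}
```

With no racer the copy is installed and rss goes up by one; with
model_racer_remap the re-check fails and the copy is thrown away, so
the fault is simply retried against the new pte.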

