Thursday, December 6, 2012

Performance of Unaligned Memory Access on the Raspberry Pi

The SoC of the Raspberry Pi is a Broadcom BCM2835, which contains an ARM1176JZF-S core (ARMv6 architecture).

ARMv6 adds hardware support for unaligned word (4-byte) and halfword loads and stores.
For details, please check the ARM documentation.
I would like to test the performance when a user-space process performs an unaligned memory access (load only).
Here are my test environment and test cases.

[Environment]

Kernel: Linux xbian 3.6.1 #4 PREEMPT Thu Nov 8 18:54:20 CET 2012 armv6l GNU/Linux
GCC: gcc version 4.6.3 (Debian 4.6.3-12+rpi1)

# cat /proc/cpuinfo
Processor       : ARMv6-compatible processor rev 7 (v6l)
BogoMIPS        : 697.95
Features        : swp half thumb fastmult vfp edsp java tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xb76
CPU revision    : 7

Hardware        : BCM2708
Revision        : 0002
Serial          : 0000000009b752ff


[Program]

unaligned.c
#include <stdio.h>
#include <stdint.h>

#ifdef USE_GCC_FIXUP
struct __una_u32 {
    uint32_t x;  
} __attribute__((packed));

static inline uint32_t get_unaligned_32(const void *p)
{
    const struct __una_u32 *ptr = 
        (const struct __una_u32 *)p;
    return ptr->x;
}
#endif

int main()
{
    uint8_t buf[16];
    uint32_t i, j;

    for (i = 0; i < sizeof(buf); i++)
        buf[i] = i;

    for (j = 0; j < 100000000; j++) {
        /* unaligned access */
#ifndef USE_GCC_FIXUP
        i = *(unsigned int*)(&buf[1]);
#else
        i = get_unaligned_32(&buf[1]);
#endif
    }

    printf("0x%X\n", i);

    return 0;
}

[Case]

  1. unaligned word access
    • fix up by hardware
      # gcc -o unaligned unaligned.c
      # time ./unaligned
      0x4030201

      real    0m2.053s
      user    0m2.040s
      sys     0m0.010s

    • fix up by software (gcc)
      # gcc -DUSE_GCC_FIXUP -o unaligned_gcc unaligned.c
      # time ./unaligned_gcc
      0x4030201

      real    0m6.384s
      user    0m6.370s
      sys     0m0.010s

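    The result is very clear: the hardware fixup is about three times faster than the GCC packed-struct fixup (2.05 s versus 6.38 s).

    For comparison, here is an aligned access case.
    The only change is: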
    #ifndef USE_GCC_FIXUP
            i = *(unsigned int*)(&buf[0]);
    #else
    
    # time ./aligned
    0x3020100

    real    0m1.934s
    user    0m1.920s
    sys     0m0.000s

  2. unaligned doubleword access
    Modify unaligned.c to load a doubleword (8 bytes) instead of a word.
    • fix up by kernel
      # time ./unaligned64
      0x807060504030201

      real    1m8.754s
      user    0m7.800s
      sys     1m0.830s


    • fix up by software (gcc)
      # time ./unaligned64_gcc
      0x807060504030201

      real    0m9.753s
      user    0m9.700s
      sys     0m0.030s

    Again, here is an aligned access case for comparison.
    # time ./aligned64
    0x706050403020100

    real    0m2.413s
    user    0m2.400s
    sys     0m0.000s
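So, avoid unaligned memory access if possible! The doubleword case is the worst: it is fixed up by the kernel rather than the hardware (note the 1m0.830s of sys time), and the run is about 28 times slower than the aligned case.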
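
If unaligned data cannot be avoided (for example, when parsing a packed byte stream), one option that was not benchmarked above is to copy the bytes out with memcpy() and let the compiler choose the access sequence. The following is only a minimal sketch: the helper names load_u32() and load_u64() are made up for illustration, and how this compares in speed with the cases above on the BCM2835 was not measured.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (not part of unaligned.c above): read a value
 * from an arbitrarily aligned address, in host byte order. */
static inline uint32_t load_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));   /* no alignment requirement on p */
    return v;
}

static inline uint64_t load_u64(const void *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

With these helpers, the load in the test loop would become i = load_u32(&buf[1]); and the pointer cast disappears.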

Monday, December 3, 2012

Notes on Linux Kernel Char Device Drivers

There are two ways to create a struct cdev:
  1. static way
    • declare
      • embedded in your device structure statically
        struct dummy_device {
            struct cdev cdev;
        };
    • initialize
      •  use cdev_init() function to initialize the cdev structure
        struct dummy_device dev;
        ...
        cdev_init(&dev.cdev, &dummy_fops);
    • remove
      • call cdev_del()
  2. dynamic way
    • declare
      • memory of cdev structure is allocated dynamically
        struct dummy_device {
            struct cdev *cdev;
        };
        ...
        struct dummy_device dev;
        dev.cdev = cdev_alloc();
    • initialize
      • dev.cdev->ops = &dummy_fops;
    • remove
      • call cdev_del()
      • You don't need to call kfree() to free the memory. The kernel releases it automatically (explained below).
There is one thing you need to know about the dynamic way.
If you create the cdev structure dynamically, you should not use cdev_init() to initialize it.

Let's check the source code.

void cdev_init(struct cdev *cdev, 
               const struct file_operations *fops)
{
    memset(cdev, 0, sizeof *cdev);
    INIT_LIST_HEAD(&cdev->list);
    kobject_init(&cdev->kobj, &ktype_cdev_default);

    cdev->ops = fops;
}

struct cdev *cdev_alloc(void)
{

    /* cdev is allocated here! */
    struct cdev *p = kzalloc(sizeof(struct cdev),
                     GFP_KERNEL);
    if (p) {
        INIT_LIST_HEAD(&p->list);
        kobject_init(&p->kobj, &ktype_cdev_dynamic);
    }
    return p;
}

If you create the cdev structure dynamically, ktype_cdev_dynamic is used, so cdev_dynamic_release() releases the cdev kobject and frees the memory.
If you then call cdev_init() on that dynamically created structure, the kobject type is replaced with ktype_cdev_default, whose release function does not call kfree().
In this case, the memory is leaked after you invoke cdev_del() to remove the cdev from the kernel.
You would need to free the memory manually, which is not recommended.

static void cdev_default_release(struct kobject *kobj)
{
    struct cdev *p = container_of(kobj, struct cdev, kobj);
    cdev_purge(p);
}

static void cdev_dynamic_release(struct kobject *kobj)
{
    struct cdev *p = container_of(kobj, struct cdev, kobj);
    cdev_purge(p);
    kfree(p);    /* cdev is freed here! */
}

static struct kobj_type ktype_cdev_default = {
    .release        = cdev_default_release,
};

static struct kobj_type ktype_cdev_dynamic = {
    .release        = cdev_dynamic_release,
};
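
The effect of the two release functions can be tried out in user space. The following is only a simplified model, not kernel code: every model_* name is made up for illustration, the reference counting done by the real kobject is left out, and model_cdev_del() simply calls the release function directly.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified user-space stand-ins for struct kobject and struct cdev. */
struct model_kobj {
    void (*release)(struct model_kobj *kobj);
};

struct model_cdev {
    struct model_kobj kobj;   /* first member, so the cast below is valid */
    const void *ops;
};

static int freed_count;       /* number of cdevs freed by a release function */

static void model_default_release(struct model_kobj *kobj)
{
    (void)kobj;               /* static case: nothing to free */
}

static void model_dynamic_release(struct model_kobj *kobj)
{
    free((struct model_cdev *)kobj);   /* cdev is freed here */
    freed_count++;
}

static struct model_cdev *model_cdev_alloc(void)
{
    struct model_cdev *p = calloc(1, sizeof(*p));

    if (p)
        p->kobj.release = model_dynamic_release;
    return p;
}

static void model_cdev_init(struct model_cdev *cdev, const void *ops)
{
    memset(cdev, 0, sizeof(*cdev));
    cdev->kobj.release = model_default_release;   /* overwrites dynamic release */
    cdev->ops = ops;
}

static void model_cdev_del(struct model_cdev *cdev)
{
    cdev->kobj.release(&cdev->kobj);
}
```

In this model, a cdev obtained from model_cdev_alloc() is freed by model_cdev_del(), but once model_cdev_init() has been called on it, model_cdev_del() no longer frees anything and the caller is left holding the allocation.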

Friday, November 30, 2012

A Case Study in Solving a Memory Leak Bug

Finding and fixing a recent memory leak in Imgsrc

This is a very interesting case study of resolving a memory leak in the ImageMagick library.

Friday, November 23, 2012

How the "mtdparts=" Kernel Command Line Works on the Nexus One

I was just wondering how it works.
And... the source code has the answers.

I cloned the kernel source from cm-kernel:
https://github.com/CyanogenMod/cm-kernel.git
branch: android-msm-2.6.37

  1. arch/arm/mach-msm/nand_partitions.c:
    Extracts the partition information from the ATAG set up by the boot loader.

    static int __init parse_tag_msm_partition(...)

    {
        ....
        msm_nand_data.nr_parts = count;
        msm_nand_data.parts = msm_nand_partitions;

        return 0;
    }


    This puts the partition information into flash_platform_data.

  2. drivers/mtd/devices/msm_nand.c  (msm nand flash controller driver)

    // request to use cmdlinepart parser
    static const char *part_probes[] = { "cmdlinepart", NULL,  };
    ...
    static int __devinit msm_nand_probe(struct platform_device *pdev)
    {
        ...
    #ifdef CONFIG_MTD_PARTITIONS
        err = parse_mtd_partitions(&info->mtd, part_probes,    
                                   &info->parts, 0);

        // if kernel command line has partition information, use it!
        if (err > 0)
            add_mtd_partitions(&info->mtd, info->parts, err);
        else if (err <= 0 && pdata && pdata->parts) {
            // else use partition information from boot loader
            for (i = 0; i < pdata->nr_parts; ++i) {
                pdata->parts[i].offset *= info->mtd.erasesize;
                pdata->parts[i].size *= info->mtd.erasesize;
            }
            add_mtd_partitions(&info->mtd,
                               pdata->parts, pdata->nr_parts);
        } else
    #endif
            err = add_mtd_device(&info->mtd);
    }
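
To give a feel for what a "cmdlinepart"-style parser has to do with each entry of an mtdparts= string, here is a minimal user-space sketch that parses one entry of the form size@offset(name), such as 896k@0x3ee0000(misc). It is not the kernel's implementation: the names part_spec, parse_size() and parse_partdef() are made up for illustration, and the optional parts of the real syntax (missing offset, "-" for the remaining space, flags) are not handled.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for one parsed partition entry. */
struct part_spec {
    uint64_t size;
    uint64_t offset;
    char name[32];
};

/* Parse a number with an optional k/m/g suffix. */
static uint64_t parse_size(const char *s, char **end)
{
    uint64_t v = strtoull(s, end, 0);

    if (*end == s)
        return 0;                       /* no digits at all */
    switch (**end) {
    case 'g': case 'G': v <<= 10;       /* fall through */
    case 'm': case 'M': v <<= 10;       /* fall through */
    case 'k': case 'K': v <<= 10; (*end)++; break;
    default: break;
    }
    return v;
}

/* Parse "size@offset(name)". Returns 0 on success, -1 on malformed input. */
static int parse_partdef(const char *s, struct part_spec *out)
{
    char *p;
    const char *close;
    size_t len;

    out->size = parse_size(s, &p);
    if (p == s || *p != '@')
        return -1;
    s = p + 1;
    out->offset = parse_size(s, &p);
    if (p == s || *p != '(')
        return -1;
    close = strchr(p, ')');
    if (!close)
        return -1;
    len = (size_t)(close - (p + 1));
    if (len == 0 || len >= sizeof(out->name))
        return -1;
    memcpy(out->name, p + 1, len);
    out->name[len] = '\0';
    return 0;
}
```

For the entry 896k@0x3ee0000(misc), this yields a size of 0xe0000 bytes, an offset of 0x3ee0000 and the name "misc".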

Monday, July 30, 2012

Manage Partitions of a Disk Image with a Loop Device

A loop device is a pseudo-device that makes a file accessible as a block device.
We can use a loop device to manage the partitions of a disk image.

Make sure your loop device driver is configured to support loop device partitions.
On my FC16 box, I manually added this kernel command-line parameter:
loop.max_part=63
Here is an example where I attach a disk image with losetup.

$ cat /proc/partitions
major minor  #blocks  name

   8        0   33554432 sda
   8        1     512000 sda1
   8        2   33041408 sda2
  11        0    1048575 sr0
   8       16  104857600 sdb
   8       17  104856576 sdb1
 253        0    4128768 dm-0
 253        1   28901376 dm-1

$ losetup -f disk_image
$ cat /proc/partitions
major minor  #blocks  name

   7        0    1048576 loop0
   7        1      10240 loop0p1
   7        2    1037312 loop0p2

   8        0   33554432 sda
   8        1     512000 sda1
   8        2   33041408 sda2
  11        0    1048575 sr0
   8       16  104857600 sdb
   8       17  104856576 sdb1
 253        0    4128768 dm-0
 253        1   28901376 dm-1

$ losetup -d /dev/loop0


Thursday, July 26, 2012

Raspberry Pi Has Arrived!!

After a long wait, it has finally arrived!! Raspberry Pi!!

Box (cute)
The Raspberry Pi is a credit-card-sized computer.
It is running Raspbian “wheezy” (2012-07-15).
My USB keyboard and mouse work with the Raspberry Pi.
I think I have to buy a USB Wi-Fi dongle and a USB hub.

Todo List
  • As a media center
    • Watch DVB (I have an unused DVB USB dongle)
    • Play DVD (attach a USB DVD-ROM)
    • Play video/audio file
  • As a download server
    • Attach a USB HDD
    • Run p2p clients
  • Create a customized image
    • boot, kernel, etc.
    • HW optimization
  • Run Android
    • RAM is too small?
  • More....

Wednesday, July 18, 2012

Android: Modify the Partition Layout

Storage is never enough!
Fortunately, we can pass a partition layout via the kernel command line without modifying the kernel.
Thanks to lbcoder for posting this: Custom partition layouts, ZERO brick risk!

My Nexus One's default partition layout:

# cat /proc/mtd
dev:    size   erasesize  name          size
mtd0: 000e0000 00020000 "misc"          896K
mtd1: 00400000 00020000 "recovery"      4M
mtd2: 00380000 00020000 "boot"          3M
mtd3: 09100000 00020000 "system"        145M
mtd4: 05f00000 00020000 "cache"         95M
mtd5: 0c440000 00020000 "userdata"      196M

# msm_nand MTD info during kernel booting
[   13.353302] msm_nand: DEV_CMD1: f00f3000
[   13.357299] msm_nand: NAND_EBI2_ECC_BUF_CFG: 1ff
[   13.361724] Creating 6 MTD partitions on "msm_nand":
[   13.366790] 0x000003ee0000-0x000003fc0000 : "misc"
[   13.373260] 0x000004240000-0x000004640000 : "recovery"
[   13.381561] 0x000004640000-0x0000049c0000 : "boot"
[   13.386108] 0x0000049c0000-0x00000dac0000 : "system"
[   13.538635] 0x00000dac0000-0x0000139c0000 : "cache"
[   13.637023] 0x0000139c0000-0x00001fe00000 : "userdata"
 
The cache partition is too large for me.
I would like to change the partition layout to:

dev:    size   erasesize  name          size
mtd0: 000e0000 00020000 "misc"          896K
mtd1: 00400000 00020000 "recovery"      4M
mtd2: 00380000 00020000 "boot"          3M
mtd3: 09100000 00020000 "system"        145M
mtd4: 02800000 00020000 "cache"         95M->40M
mtd5: 0FB00000 00020000 "userdata"      196M->251M

The userdata partition grows to 251M!
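
The new numbers can be double-checked with a little arithmetic on the addresses from the boot log above: cache still starts at 0x0dac0000, and the original userdata partition ends at 0x1fe00000. A minimal sketch (the helper names are made up for illustration):

```c
#include <stdint.h>

#define MIB 0x100000u   /* 1 MiB in bytes */

/* End address (exclusive) of a partition that starts at `start`
 * and is `size_mib` MiB long. */
static uint32_t part_end(uint32_t start, uint32_t size_mib)
{
    return start + size_mib * MIB;
}

/* Largest whole number of MiB that fits between `start` and `end`. */
static uint32_t whole_mib_between(uint32_t start, uint32_t end)
{
    return (end - start) / MIB;
}
```

part_end(0x0dac0000, 40) gives 0x102c0000, which is where userdata now has to start, and whole_mib_between(0x102c0000, 0x1fe00000) gives 251, the largest whole-megabyte size that fits.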

So, I need to add a kernel command line like the one below:
mtdparts=msm_nand:896k@0x3ee0000(misc),4M@0x4240000(recovery),3M@0x4640000(boot),145M@0x49c0000(system),40M@0xdac0000(cache),251M@0x102C0000(userdata)
But how?
I did it manually. Both the recovery and boot images need to be modified.
  1. Boot into recovery mode
  2. Extract the recovery image from the recovery partition and use a hex editor to add the kernel command line to it
  3. Flash the modified recovery image back, then boot into recovery again
  4. Check the MTD information; the new layout is in effect.
    [    9.726074] msm_nand: DEV_CMD1: f00f3000
    [    9.726257] msm_nand: NAND_EBI2_ECC_BUF_CFG: 1ff
    [    9.726440] 6 cmdlinepart partitions found on MTD device msm_nand
    [    9.726684] Creating 6 MTD partitions on "msm_nand":
    [    9.726837] 0x000003ee0000-0x000003fc0000 : "misc"
    [    9.728668] 0x000004240000-0x000004640000 : "recovery"
    [    9.733581] 0x000004640000-0x000004940000 : "boot"
    [    9.737548] 0x0000049c0000-0x00000dac0000 : "system"
    [    9.893493] 0x00000dac0000-0x0000102c0000 : "cache"
    [    9.937652] 0x0000102c0000-0x00001fdc0000 : "userdata"
     
  5. Flash the ROM zip from the SD card (I use CM7.2)
  6. Extract the boot image from the boot partition and use a hex editor to add the kernel command line to it
  7. Flash the modified boot image back, then reboot
  8. Boot into CM, then check the MTD information.