---
title: Xenomai
categories: embedded, arm, raspberrypi
...


Setting up the environment
==============================
* Download Raspbian: http://www.raspberrypi.org/downloads/

* Install the cross compiler

.. code-block:: bash

    cd <working dir>
    wget https://github.com/raspberrypi/tools/archive/master.tar.gz
    tar xzfv master.tar.gz

* Download the kernel

.. code-block:: bash

   git clone -b rpi-3.8.y --depth 1 git://github.com/raspberrypi/linux.git linux-rpi-3.8.y

* Download Xenomai

.. code-block:: bash

   git clone git://git.xenomai.org/xenomai-head.git xenomai-head

* Download the minimal config

.. code-block:: bash

   wget https://www.dropbox.com/s/dcju74md5sz45at/rpi_xenomai_config

* Apply patches

  - Apply the ipipe core pre-patch

  .. code-block:: bash

    cd linux-rpi-3.8.y
    patch -Np1 < ../xenomai-head/ksrc/arch/arm/patches/raspberry/ipipe-core-3.8.13-raspberry-pre-2.patch

  - Apply the Xenomai ipipe core patch

  .. code-block:: bash

    cd <working dir>
    xenomai-head/scripts/prepare-kernel.sh --arch=arm --linux=linux-rpi-3.8.y --adeos=xenomai-head/ksrc/arch/arm/patches/ipipe-core-3.8.13-arm-3.patch

  - Apply the ipipe core post-patch

  .. code-block:: bash

    cd linux-rpi-3.8.y
    patch -Np1 < ../xenomai-head/ksrc/arch/arm/patches/raspberry/ipipe-core-3.8.13-raspberry-post-2.patch

* Compile the kernel

  - Create the build directory

  .. code-block:: bash

    mkdir linux-rpi-3.8.y/build

  - Configure the kernel

  .. code-block:: bash

    cp rpi_xenomai_config linux-rpi-3.8.y/build/.config
    cd linux-rpi-3.8.y
    make mrproper
    make ARCH=arm O=build oldconfig

  - Compile

  .. code-block:: bash

    make ARCH=arm O=build CROSS_COMPILE=/home/$USER/workspace/tools-master/arm-bcm2708/arm-bcm2708hardfp-linux-gnueabi/bin/arm-bcm2708hardfp-linux-gnueabi-

  - Install modules

  .. code-block:: bash

    make ARCH=arm O=build INSTALL_MOD_PATH=dist modules_install

  - Install headers

  .. code-block:: bash

    make ARCH=arm O=build INSTALL_HDR_PATH=dist headers_install
    find build/dist/include \( -name .install -o -name ..install.cmd \) -delete

* Copy the compiled kernel image to ``/boot/`` on the SD card and rename it to ``kernel.img``.
* Copy the modules from ``linux-rpi-3.8.y/build/dist`` to ``/lib/modules`` on the SD card.
* Cyclictest

  - Linux

  - Xenomai

  .. code-block:: bash

    cd xenomai-head
    export PATH=../tools-master/arm-bcm2708/arm-bcm2708hardfp-linux-gnueabi/bin/:$PATH
    ./configure --host=arm-bcm2708hardfp-linux-gnueabi
    cd src
    mkdir dist
    make install DIST_DIR=dist

  ``usr/xenomai`` then appears under ``dist``.
  Copy this directory to ``/usr/`` on the SD card.

  On the Raspberry Pi:

  .. code-block:: bash

    export PATH=/usr/xenomai/bin:$PATH
    export LD_LIBRARY_PATH=/usr/xenomai/lib
    sudo modprobe xeno_posix

  After that, the cyclictest built on the Xenomai mechanisms can be run.

  - RT_preempt


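Definition of Real Time
==============================

* Hard Real Time

 The system is guaranteed to complete the specified task within the response time.

* Soft Real Time

 The system completes the specified task within the response time with a given probability.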
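
The two definitions can be turned into a small check over measured response times. The sketch below is only an illustration: the function names and the idea of classifying a set of samples are assumptions made here, and a finite set of measurements can refute a hard real-time claim but never prove it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fraction of samples that finished within the deadline. */
double deadline_hit_ratio(const long *response_us, size_t n, long deadline_us)
{
    size_t hits = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        if (response_us[i] <= deadline_us)
            hits++;
    return (double)hits / (double)n;
}

/* Hard real time: not a single observed response may miss the deadline. */
int meets_hard_deadline(const long *response_us, size_t n, long deadline_us)
{
    for (size_t i = 0; i < n; i++)
        if (response_us[i] > deadline_us)
            return 0;
    return 1;
}

/* Soft real time: at least min_ratio of the responses meet the deadline. */
int meets_soft_deadline(const long *response_us, size_t n,
                        long deadline_us, double min_ratio)
{
    return deadline_hit_ratio(response_us, n, deadline_us) >= min_ratio;
}
```

With samples of 5, 6, 41 and 30 microseconds, a 50 microsecond deadline passes the hard check, while a 40 microsecond deadline only passes a soft check with a ratio of 0.75 or lower.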

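Operating system architecture
==============================

.. image:: /embedded/xenomai/xenomai_arch.jpg

Xenomai is a patch to the Linux kernel.
It adds a layer at the bottom that manages the hardware and receives interrupts,
then passes each interrupt up to the operating systems above it (called domains here).

This bottom layer is Adeos, which is a separate open source project.


The API calls show the different levels of abstraction:

ipipe_XXX -> rthal_XXX -> xnXXX

The code responsible for delivering interrupts is called ipipe.
Diagram:
http://www.xenomai.org/documentation/xenomai-2.6/html/pictures/life-with-adeos-img4.jpg

.. image:: /embedded/xenomai/adeos.jpg

``ipipe_raise_irq()`` is the call that pushes an interrupt into the pipeline.

On the ipipe, every domain has its own priority.
A higher-priority domain receives an interrupt first,
and a thread in a higher-priority domain can preempt a thread in a lower-priority domain.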
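
The priority-ordered delivery described above can be modelled in a few lines. This is a simplified sketch and not the real I-pipe API: ``struct domain``, ``pipeline_dispatch`` and the ``consumes`` flag are names invented for the illustration. It assumes that a stalled domain logs the interrupt as pending and holds it back from the domains below it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one domain sitting on the interrupt pipeline. */
struct domain {
    const char *name;
    int stalled;        /* 1: the domain is not accepting interrupts right now */
    unsigned pending;   /* bitmask of IRQs logged while the domain was stalled */
    int consumes;       /* 1: the domain ends the walk, 0: it passes the IRQ on */
    unsigned delivered; /* number of IRQs handed to this domain */
};

/*
 * Walk the pipeline from the highest-priority domain (index 0) downwards.
 * Returns the index of the domain where the IRQ stopped, or -1 if every
 * domain passed it on.
 */
int pipeline_dispatch(struct domain *d, size_t n, unsigned irq)
{
    assert(irq < 32);
    for (size_t i = 0; i < n; i++) {
        if (d[i].stalled) {
            d[i].pending |= 1u << irq;  /* log it for replay after unstall */
            return (int)i;
        }
        d[i].delivered++;
        if (d[i].consumes)
            return (int)i;
    }
    return -1;
}
```

With Xenomai at index 0 and Linux at index 1, an interrupt that Xenomai does not consume reaches Linux afterwards, which matches the ordering in the text.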


iPipe
++++++++++++++

ipipe mainly handles IRQs and the high-resolution timer (HRT). Its job is simple: program the timer and pass interrupts upward.

* Related files:

  - gic.c:

    Generic Interrupt Controller. It performs interrupt prioritization and distribution to each CPU interface (this part is known as the Distributor), plus priority masking and preemption handling for each CPU (this part is known as the CPU Interface).

  - it8152.c: IRQ related

  - timer-sp.c: dual timer module (SP804)

  - vic.c:

    The VIC provides a software interface to the interrupt system. In a system with an interrupt controller, software must determine the source that is requesting service and where its service routine is loaded. A VIC does both of these in hardware.

    Its role is to offer a programmable interface that the user can configure.

  - ipipe-tsc.c: sets the timer precision (tick granularity)

  - ipipe/compat.c: interrupt

  - sched/clock.c: returns cpu_clock with nanosecond resolution, counting up from 0 at boot

.. image:: /embedded/xenomai/cpu_distribute.jpg

The GIC sits roughly where the distributor is in the figure above.

The VIC sits where the CPU interface is.

The Raspberry Pi has only one CPU, however, so SMP and CPU affinity settings are not an issue.


HAL
++++++++++++

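Hardware Abstraction Layer: processes call ipipe services through the HAL. This layer mainly wraps ipipe and the low-level details so that the nucleus does not have to deal with hardware information.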


Nucleus
++++++++++++

The Xenomai kernel. It contains a scheduler that runs real-time tasks first.

Scheduler
++++++++++++

Real-time tasks are handled first. Linux is treated as just one of the threads; it has its own scheduler, but the Linux thread only runs when no real-time task is runnable (idle state).

.. image:: /embedded/xenomai/xenomai_sched.jpg
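
The rule "real-time threads first, Linux only when nothing else is runnable" can be sketched as a selection function. This is a toy model and not the nucleus code: ``struct rt_thread``, ``pick_next`` and the ``is_root`` flag are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread descriptor for the illustration. */
struct rt_thread {
    const char *name;
    int prio;     /* larger value = more urgent */
    int ready;    /* 1 if the thread is runnable */
    int is_root;  /* 1 for the thread that stands for the whole of Linux */
};

/*
 * Return the highest-priority ready real-time thread, or the root (Linux)
 * thread when no real-time thread is runnable.
 */
const struct rt_thread *pick_next(const struct rt_thread *t, size_t n)
{
    const struct rt_thread *best = NULL, *root = NULL;

    for (size_t i = 0; i < n; i++) {
        if (t[i].is_root) {
            root = &t[i];       /* remembered as the fallback */
            continue;
        }
        if (t[i].ready && (best == NULL || t[i].prio > best->prio))
            best = &t[i];
    }
    return best ? best : root;
}
```

The root thread is never compared by priority here; it is picked only as a fallback, which is the "idle state" case from the paragraph above.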

Skins
++++++++++++

The interfaces for calling Xenomai, such as native, RTDM, and POSIX.

Questions
++++++++++++

How does this differ from the RT-PREEMPT approach?

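* RT-PREEMPT mechanisms

  - Preemptible critical sections
  - Preemptible interrupt handlers
  - Preemptible "interrupt disable" code sequences
  - Priority inheritance for in-kernel spinlocks and semaphores
  - Deferred operations
  - Latency-reduction measures

  Code that previously could not be preempted is made preemptible: spinlock regions are split into a preemptible part and a non-preemptible part, and IRQ handlers are moved into threads.

  Priority inheritance temporarily raises the priority of the process that holds a spinlock or semaphore, so that it finishes its critical section and releases the spinlock or semaphore as soon as possible.

  Only then can the high-priority process continue.

* Differences between RT_PREEMPT and Xenomai

  RT_PREEMPT improves on the existing Linux architecture, making more places preemptible to reach real-time capability.

  Xenomai instead changes the overall system architecture by adding a scheduler and an IRQ management mechanism.

  The path for handling a real-time task shrinks to just ipipe -> scheduler,

  so the sheer size of Linux no longer affects the handling time of real-time tasks.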
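
The priority inheritance rule listed under the RT-PREEMPT mechanisms can be shown with a toy lock. This is a minimal sketch under stated assumptions, not kernel code: ``struct task``, ``struct pi_lock``, ``pi_lock_try`` and ``pi_unlock`` are hypothetical names, and a real implementation would also block the waiter and handle chains of locks.

```c
#include <assert.h>
#include <stddef.h>

struct task {
    const char *name;
    int base_prio;  /* priority assigned by the user */
    int prio;       /* effective priority, possibly boosted */
};

struct pi_lock {
    struct task *owner;  /* NULL when the lock is free */
};

/*
 * Try to take the lock. Returns 1 on success. On contention the owner
 * inherits the waiter's priority if it is higher, and 0 is returned.
 */
int pi_lock_try(struct pi_lock *l, struct task *t)
{
    if (l->owner == NULL) {
        l->owner = t;
        return 1;
    }
    if (t->prio > l->owner->prio)
        l->owner->prio = t->prio;  /* boost the holder */
    return 0;
}

/* Release the lock and drop any inherited priority. */
void pi_unlock(struct pi_lock *l)
{
    assert(l->owner != NULL);
    l->owner->prio = l->owner->base_prio;
    l->owner = NULL;
}
```

When a task of priority 90 contends for a lock held by a task of priority 10, the holder runs at 90 until it unlocks, so a medium-priority task cannot delay the release.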

Implementation
==================


Context switch
++++++++++++++

.. code-block:: prettyprint

    /* include/arch/arm-asm/bits/pod.h */
    static inline void xnarch_switch_to(xnarchtcb_t *out_tcb, xnarchtcb_t *in_tcb)
    {
            struct task_struct *prev = out_tcb->active_task;
            struct mm_struct *prev_mm = out_tcb->active_mm;
            struct task_struct *next = in_tcb->user_task;

            if (likely(next != NULL)) {
                    in_tcb->active_task = next;
                    in_tcb->active_mm = in_tcb->mm;
                    rthal_clear_foreign_stack(&rthal_domain);
            } else {
                    in_tcb->active_task = prev;
                    in_tcb->active_mm = prev_mm;
                    rthal_set_foreign_stack(&rthal_domain);
            }

            if (prev_mm != in_tcb->active_mm) {
                    /* Switch to new user-space thread? */
                    if (in_tcb->active_mm)
                            wrap_switch_mm(prev_mm, in_tcb->active_mm, next);
                    if (!next->mm)
                            enter_lazy_tlb(prev_mm, next);
            }
            /* Kernel-to-kernel context switch. */
            rthal_thread_switch(prev, out_tcb->tip, in_tcb->tip);
    }

.. code-block:: prettyprint

    /* ksrc/arch/arm/switch.S */
    /*
     * Switch context routine.
     *
     * Registers according to the ARM procedure call standard:
     *   Reg    Description
     *   r0-r3  argument/scratch registers
     *   r4-r9  variable register
     *   r10=sl stack limit/variable register
     *   r11=fp frame pointer/variable register
     *   r12=ip intra-procedure-call scratch register
     *   r13=sp stack pointer (auto preserved)
     *   r14=lr link register
     *   r15=pc program counter (auto preserved)
     *
     * Copied from __switch_to, arch/arm/kernel/entry-armv.S.
     * Right now it is identical, but who knows what the
     * future reserves us...
     *
     * XXX: All the following config options are NOT tested:
     *      CONFIG_IWMMXT
     *
     *  Calling args:
     * r0 = previous task_struct, r1 = previous thread_info, r2 = next thread_info
     */
    ENTRY(rthal_thread_switch)
            add     ip, r1, #TI_CPU_SAVE
     ARM(        stmia        ip!, {r4 - sl, fp, sp, lr} )        @ Store most regs on stack
     THUMB(        stmia        ip!, {r4 - sl, fp}           )        @ Store most regs on stack
     THUMB(        str        sp, [ip], #4                   )
     THUMB(        str        lr, [ip], #4                   )
            load_tls r2, r4, r5
    #ifdef USE_DOMAINS
        ldr     r6, [r2, #TI_CPU_DOMAIN]
    #endif
            clear_exclusive_monitor
            switch_tls r1, r4, r5, r3, r7
    #ifdef USE_DOMAINS
            mcr     p15, 0, r6, c3, c0, 0           @ Set domain register
    #endif
            fpu_switch r4
     ARM(        add        r4, r2, #TI_CPU_SAVE           )
     ARM(        ldmia        r4, {r4 - sl, fp, sp, pc}  )        @ Load all regs saved previously
     THUMB(        add        ip, r2, #TI_CPU_SAVE           )
     THUMB(        ldmia        ip!, {r4 - sl, fp}           )        @ Load all regs saved previously
     THUMB(        ldr        sp, [ip], #4                   )
     THUMB(        ldr        pc, [ip]                   )
    ENDPROC(rthal_thread_switch)




Scheduler
++++++++++

.. code-block:: prettyprint

     void __xnpod_schedule(struct xnsched *sched)
     {
            int zombie, switched, need_resched, shadow;
            struct xnthread *prev, *next, *curr;
            spl_t s;
            if (xnarch_escalate())    /* Checks whether we are currently in the root (Linux) domain; if so, it raises an interrupt so that rescheduling is handed over to the Xenomai domain */
                    return;
            trace_mark(xn_nucleus, sched, MARK_NOARGS);
            xnlock_get_irqsave(&nklock, s);    /* An interrupt arriving during scheduling would corrupt it, so a lock must be taken.
                                                * This is a spinlock; a test-and-set style instruction makes acquiring it atomic.
                                                * The irqsave suffix means it also disables interrupts. */
            curr = sched->curr;
            xnarch_trace_pid(xnthread_user_task(curr) ?
                             xnarch_user_pid(xnthread_archtcb(curr)) : -1,
                             xnthread_current_priority(curr));
            reschedule:
            switched = 0;
            need_resched = __xnpod_test_resched(sched);
            #if !XENO_DEBUG(NUCLEUS)
            if (!need_resched)
                    goto signal_unlock_and_exit;
            #endif /* !XENO_DEBUG(NUCLEUS) */
            zombie = xnthread_test_state(curr, XNZOMBIE);
            next = xnsched_pick_next(sched);

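            /* First check whether the thread is a zombie, then whether the current thread holds the scheduler lock; if it does, it cannot be preempted and current is returned.
             * Next check whether the current thread is marked ready; if not, it is put back on the run queue. Finally the head of the priority queue in sched_rt is returned,
             * which is the thread with the highest priority. */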

            if (next == curr && !xnthread_test_state(curr, XNRESTART)) {
                    /* Note: the root thread never restarts. */
                    if (unlikely(xnthread_test_state(next, XNROOT))) {
                            if (testbits(sched->lflags, XNHTICK))
                                    xnintr_host_tick(sched);
                                    /* next is the root thread here, so the pending host tick is relayed to root (Linux) */
                            if (testbits(sched->lflags, XNHDEFER))
                                    xntimer_next_local_shot(sched);
                    }
                    goto signal_unlock_and_exit;
            }
            XENO_BUGON(NUCLEUS, need_resched == 0);
            prev = curr;
            trace_mark(xn_nucleus, sched_switch,
                       "prev %p prev_name %s "
                       "next %p next_name %s",
                       prev, xnthread_name(prev),
                       next, xnthread_name(next));

            #ifdef CONFIG_XENO_OPT_PERVASIVE
            shadow = xnthread_test_state(prev, XNSHADOW);
            #else
            (void)shadow;
            #endif /* CONFIG_XENO_OPT_PERVASIVE */
           
            if (xnthread_test_state(next, XNROOT)) {    /* If the next thread is root, the watchdog timer is reset */
                    xnsched_reset_watchdog(sched);
                    xnfreesync();
            }
            
            if (zombie)
                    xnsched_zombie_hooks(prev);
            sched->curr = next;
            if (xnthread_test_state(prev, XNROOT))    /* Check whether prev or next is root; leaving and entering root use the dedicated xnarch_leave_root() and xnarch_enter_root() */
                    xnarch_leave_root(xnthread_archtcb(prev));
            else if (xnthread_test_state(next, XNROOT)) {
                    if (testbits(sched->lflags, XNHTICK))
                            xnintr_host_tick(sched);
                    if (testbits(sched->lflags, XNHDEFER))
                            xntimer_next_local_shot(sched);
                    xnarch_enter_root(xnthread_archtcb(next));
            }
            
            xnstat_exectime_switch(sched, &next->stat.account); /* Record the thread's csw in xnstat so that /proc/xenomai/stat shows the state of Xenomai threads */
            xnstat_counter_inc(&next->stat.csw);
            xnpod_switch_to(sched, prev, next);

            /* Context switch */

            #ifdef CONFIG_XENO_OPT_PERVASIVE
            /*
             * Test whether we transitioned from primary mode to secondary
             * over a shadow thread. This may happen in two cases:
             *
             * 1) the shadow thread just relaxed.
             * 2) the shadow TCB has just been deleted, in which case
             * we have to reap the mated Linux side as well.
             *
             * In both cases, we are running over the epilogue of Linux's
             * schedule, and should skip our epilogue code.
             */

            if (shadow && xnarch_root_domain_p())
                    goto shadow_epilogue;
            #endif /* CONFIG_XENO_OPT_PERVASIVE */
            switched = 1;
            sched = xnsched_finish_unlocked_switch(sched);
            
            /*
             * Re-read the currently running thread, this is needed
             * because of relaxed/hardened transitions.
             */

            curr = sched->curr;
            xnarch_trace_pid(xnthread_user_task(curr) ?
                             xnarch_user_pid(xnthread_archtcb(curr)) : -1,
                             xnthread_current_priority(curr));

            if (zombie)
                    xnpod_fatal("zombie thread %s (%p) would not die...",
                                prev->name, prev);
            xnsched_finalize_zombie(sched);    /* Dispose of the zombie */
            __xnpod_switch_fpu(sched);
            #ifdef __XENO_SIM__
            if (nkpod->schedhook)
                    nkpod->schedhook(curr, XNRUNNING);
            #endif /* __XENO_SIM__ */
            xnpod_run_hooks(&nkpod->tswitchq, curr, "SWITCH");
            signal_unlock_and_exit:

            if (xnthread_signaled_p(curr))
                    xnpod_dispatch_signals();
            if (switched &&
                xnsched_maybe_resched_after_unlocked_switch(sched))
                    goto reschedule;
            if (xnthread_lock_count(curr))
                    __setbits(sched->lflags, XNINLOCK);
            xnlock_put_irqrestore(&nklock, s);   /* Release the lock */
            return;
            #ifdef CONFIG_XENO_OPT_PERVASIVE
            
            shadow_epilogue:
            
             /* Shadow on entry and root without shadow extension on exit?
               Mmmm... This must be the user-space mate of a deleted real-time
               shadow we've just rescheduled in the Linux domain to have it
               exit properly.  Reap it now. */

            if (xnshadow_thrptd(current) == NULL) {
                    splnone();
                    xnshadow_exit();
            }

            /* Interrupts must be disabled here (has to be done on entry of the
               Linux [__]switch_to function), but it is what callers expect,
               specifically the reschedule of an IRQ handler that hit before we
               call xnpod_schedule in xnpod_suspend_thread when relaxing a
               thread. */

            XENO_BUGON(NUCLEUS, !irqs_disabled_hw());
            return;
            #endif /* CONFIG_XENO_OPT_PERVASIVE */
     }

     EXPORT_SYMBOL_GPL(__xnpod_schedule);

Observation and analysis
==============================

.. code-block:: prettyprint

     pi@raspberrypi:~$ cat /proc/xenomai/stat
     CPU    PID        MSW                  CSW           PF         STAT            %CPU    NAME
     0      0            0                  206           0          00500080        100.0   ROOT
     0      0            0                  2688553       0          00000000        0.0     IRQ3: [timer]

* CPU : the CPU this thread runs on. The RPi has a single-core CPU, so this is always 0.
* MSW : Mode SWitches. This value should only increase over time for threads that are expected to interact with Linux services.

  - One switch is recorded each time a process goes from primary mode to secondary mode, or from secondary mode to primary mode.

  - The RT task in cyclictest calls memset, so it jumps from the Xenomai scheduler to the Linux scheduler (MSW+1); once memset is done it jumps back to the Xenomai scheduler, which adds another 1.

* CSW : Number of Context SWitches (or IRQ hits for the particular CPU)
* PF : Number of Page Faults (should stop increasing as soon as mlockall is in effect)
* STAT : A bitfield describing the internal state of the thread. Bit values are defined in include/nucleus/thread.h (See status and mode bits). The STAT field from /proc/xenomai/sched gives a 1-letter-per-bit symbolic translation of the most significant subset of those bits.
* %CPU : CPU share of the thread (or IRQ handler) since the last retrieval of the statistics.
* NAME : Name of the thread (or IRQ number and registered driver). Can be set, e.g., with the (non portable) POSIX-API-function pthread_set_name_np. See API documentation of the RTOS skin in question.
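
To watch MSW and CSW from a program rather than by eye, one line of ``/proc/xenomai/stat`` can be split into the fields listed above. The sketch below assumes the eight-column layout shown in the sample output; ``struct xeno_stat`` and ``parse_stat_line`` are hypothetical names, not part of Xenomai.

```c
#include <assert.h>
#include <stdio.h>

/* One row of /proc/xenomai/stat, following the column order shown above. */
struct xeno_stat {
    int cpu, pid;
    unsigned long msw, csw, pf;
    unsigned long stat;   /* STAT bitfield, printed in hexadecimal */
    double cpu_share;     /* %CPU */
    char name[32];        /* NAME may contain spaces, e.g. "IRQ3: [timer]" */
};

/* Returns 0 on success, -1 when the line is not a data row (e.g. the header). */
int parse_stat_line(const char *line, struct xeno_stat *out)
{
    int n;

    assert(line != NULL && out != NULL);
    n = sscanf(line, "%d %d %lu %lu %lu %lx %lf %31[^\n]",
               &out->cpu, &out->pid, &out->msw, &out->csw, &out->pf,
               &out->stat, &out->cpu_share, out->name);
    return n == 8 ? 0 : -1;
}
```

Reading the file once per second and comparing ``msw`` between two reads gives the same information as the ``watch`` command used in this section.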

.. code-block:: prettyprint

     pi@raspberrypi:~$ sudo /usr/xenomai/bin/cyclictest >/dev/null 2>/dev/null &
     [1] 2253

.. code-block:: prettyprint

     pi@raspberrypi:~$ ps aux | grep -i "cy"
     root      2253  0.5  0.3   4580   1464  ?        S    03:34   0:00   sudo /usr/xenomai/bin/cyclictest
     root      2254  2.7  0.4   2340   2132  ?        SLl  03:34   0:00   /usr/xenomai/bin/cyclictest
     pi        2259  0.0  0.1   3540   820   ttyAMA0  S+   03:34   0:00   grep --color=auto -i cy

.. code-block:: prettyprint

     pi@raspberrypi:~$ cat /proc/xenomai/stat
     CPU    PID        MSW                CSW              PF        STAT        %CPU    NAME
     0      0          0                  255              0         00500080    100.0   ROOT
     0      2254       1                  1                0         00b00380    0.0     cyclictest
     0      2256       2                  48               0         00300184    0.0     cyclictest
     0      0          0                  2913946          0         00000000    0.0     IRQ3: [timer]

.. code-block:: prettyprint

     pi@raspberrypi:~$ watch -n 1 cat /proc/xenomai/stat
     Every 1.0s: cat /proc/xenomai/stat                      Wed Jan  8 03:38:43 2014

     CPU    PID        MSW                CSW           PF        STAT         %CPU     NAME
     0      0          0                  442           0         00500080     99.9     ROOT
     0      2254       1                  1             0         00b00380     0.0      cyclictest
     0      2256       2                  235           0         00300184     0.0      cyclictest
     0      0          0                  2953543       0         00000000     0.1      IRQ3: [timer]

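Here cyclictest shows up under two PIDs, because /usr/xenomai/bin/cyclictest first creates a thread and has that thread run nanosleep, so there are two entries. Looking at CSW, the cyclictest with PID 2254 has a CSW of only 1, while PID 2256 has 235. The reason is that 2256 is a Xenomai real-time task and 2254 is a Linux process: 2256 runs first, and only when all real-time work is done does the low-priority Linux-domain process get the CPU. That is why the CSW of 2254 stays at 1 and does not increase.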

.. code-block:: prettyprint

     pi@raspberrypi:~$ sudo kill 2254

     pi@raspberrypi:~$ ps aux | grep -i "cy"
     pi        2324  0.0  0.1   3540   820 ttyAMA0  R+   03:46   0:00 grep --color=auto -i cy
     [1]+  Done                    sudo /usr/xenomai/bin/cyclictest > /dev/null 2> /dev/null

     pi@raspberrypi:~$ sudo /usr/xenomai/bin/cyclictest -p FIFO >/dev/null 2>/dev/null &

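* While trying to understand MSW, we tried passing text after ``-p`` (for example FIFO, RR, ...).

* The MSW value started to climb, and we also found that our initial understanding of the definition of MSW was wrong.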

.. code-block:: prettyprint

      CPU    PID        MSW                CSW                PF         STAT          %CPU     NAME
      0      0          0                  75266              0          00500080      99.9     ROOT
      0      2978       1                  1                  0          00b00380      0.0      cyclictest
      0      2980       2                  26846              0          00300184      0.0      cyclictest
      0      7559       1                  1                  0          00b00380      0.0      cyclictest
      0      7561       66                 130                0          00b00184      0.0      cyclictest
      0      0          0                  11266931           0          00000000      0.1      IRQ3: [timer]

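* Tracing showed the cause: the ``-p`` option is parsed with ``atoi``, which converts the input to an int without any error checking. This led to a segmentation fault, and handling it required a switch to the Linux domain.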
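
The missing validation can be avoided with ``strtol``, which reports where parsing stopped. The function below is a sketch of that technique and not the actual cyclictest code; ``parse_priority`` and its range parameters are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal priority in the range [min, max].
 * Returns 0 and stores the value in *out on success.
 * Returns -1 for empty input, non-numeric text, trailing garbage,
 * overflow, or an out-of-range value; *out is left untouched then.
 */
int parse_priority(const char *s, int min, int max, int *out)
{
    char *end;
    long v;

    assert(out != NULL);
    if (s == NULL || *s == '\0')
        return -1;
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0')
        return -1;  /* overflow, no digits at all, or text after the digits */
    if (v < min || v > max)
        return -1;
    *out = (int)v;
    return 0;
}
```

With this check, ``-p 80`` is accepted while ``-p FIFO`` is rejected before any thread is created, instead of silently turning into a number.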

Performance
==============================

Cyclictest
+++++++++++

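* Concept: set an interval -> read the current time -> put the process to sleep for one interval -> read the time again when the process wakes up -> compare the difference between the two readings with the configured interval
* Pseudocode: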

  .. code-block:: prettyprint

    clock_gettime((&now))
    next = now + par->interval
    while (!shutdown) {
        clock_nanosleep((&next))
        clock_gettime((&now))
        diff = calcdiff(now, next)
        # update stat-> min, max, total latency, cycles
        # update the histogram data
        next += interval
    }

* Causes of the time difference

  - timer precision
  - IRQ latency
  - IRQ handler duration
  - scheduler latency
  - scheduler duration
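
The pseudocode above maps almost directly onto the POSIX clock API. The following is a minimal sketch under stated assumptions, not the cyclictest source: ``struct latency_stat``, ``calcdiff_us``, ``ts_add_ns`` and ``measure_latency`` are names chosen here, the histogram is left out, and the loop runs in the calling thread at whatever priority it already has.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

struct latency_stat {
    int64_t min_us, max_us, total_us;
    long cycles;
};

/* Difference a - b in microseconds. */
int64_t calcdiff_us(struct timespec a, struct timespec b)
{
    int64_t ns = (int64_t)(a.tv_sec - b.tv_sec) * NSEC_PER_SEC
               + (a.tv_nsec - b.tv_nsec);
    return ns / 1000;
}

/* Advance *t by ns nanoseconds, keeping tv_nsec within [0, 1e9). */
void ts_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Sleep until each absolute deadline and record how late the wake-up was. */
int measure_latency(long interval_us, long loops, struct latency_stat *st)
{
    struct timespec now, next;

    assert(st != NULL && interval_us > 0);
    st->min_us = INT64_MAX;
    st->max_us = INT64_MIN;
    st->total_us = 0;
    st->cycles = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    ts_add_ns(&next, interval_us * 1000);
    for (long i = 0; i < loops; i++) {
        int err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        if (err != 0)
            return err;                 /* e.g. EINTR */
        clock_gettime(CLOCK_MONOTONIC, &now);
        int64_t diff = calcdiff_us(now, next);
        if (diff < st->min_us) st->min_us = diff;
        if (diff > st->max_us) st->max_us = diff;
        st->total_us += diff;
        st->cycles++;
        ts_add_ns(&next, interval_us * 1000);
    }
    return 0;
}
```

Using an absolute deadline (``TIMER_ABSTIME``) keeps the period from drifting: a late wake-up in one cycle does not push back the next deadline, so each ``diff`` measures only that cycle's latency.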

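* Implementation flow

  1. cyclictest creates a timerthread, which is a real-time thread.

  2. The timerthread repeatedly reads the time, calls nanosleep(interval), reads the time again, and compares the difference between the two readings with the interval.

  3. The result is printed to the terminal.

  - In the main function of cyclictest, the timerthread is created through the POSIX skin API: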

  .. code-block:: prettyprint

      pthread_create(&stat[i].thread, &thattr, timerthread, &par[i]);    //pthread_create() is defined in ksrc/skins/posix/thread.c

  .. code-block:: prettyprint

          int pthread_create(pthread_t *tid,
                       const pthread_attr_t * attr,
                       void *(*start) (void *), void *arg){

            union xnsched_policy_param param;  /* Scheduling parameters: either sched_rt_param or sched_idle_param (both hold a single int prio) */
            struct xnthread_start_attr sattr;
            struct xnthread_init_attr iattr;
            pthread_t thread, cur; /* pthread_t points to a pse51_thread, which consists of an xnthread_t plus other members */
            xnflags_t flags = 0;
            size_t stacksize;
            const char *name;
            int prio, ret;
            spl_t s;

            if (attr && attr->magic != PSE51_THREAD_ATTR_MAGIC)
                    return EINVAL;
            /* Allocate space for the thread and initialise it below */

            thread = (pthread_t)xnmalloc(sizeof(*thread));

            if (!thread)
                    return EAGAIN;

            thread->attr = attr ? *attr : default_attr;
            cur = pse51_current_thread();
            if (thread->attr.inheritsched == PTHREAD_INHERIT_SCHED) {

                    /* cur may be NULL if pthread_create is not called by a pse51
                       thread, in which case trying to inherit scheduling
                       parameters is treated as an error. */
                    /* This seems to mean: if the caller is not a pse51 thread built on xnthread but a plain Linux POSIX pthread, it is treated as an error */
                    if (!cur) {
                            xnfree(thread);
                            return EINVAL;
                    }
                    pthread_getschedparam_ex(cur, &thread->attr.policy,
                                             &thread->attr.schedparam_ex);
            }

            prio = thread->attr.schedparam_ex.sched_priority;
            stacksize = thread->attr.stacksize;
            name = thread->attr.name;
            if (thread->attr.fp)
                    flags |= XNFPU;
            if (!start)
                    flags |= XNSHADOW;        /* Note: no interrupt shield. */
            iattr.tbase = pse51_tbase;
            iattr.name = name;
            iattr.flags = flags;
            iattr.ops = &pse51_thread_ops;
            iattr.stacksize = stacksize;
            param.rt.prio = prio;
            thread->arg = arg;
            xnsynch_init(&thread->join_synch, XNSYNCH_PRIO, NULL);
            thread->nrt_joiners = 0;
            pse51_cancel_init_thread(thread);
            pse51_signal_init_thread(thread, cur);
            pse51_tsd_init_thread(thread);
            pse51_timer_init_thread(thread);

            if (thread->attr.policy == SCHED_RR)
                    xnpod_set_thread_tslice(&thread->threadbase, pse51_time_slice);
            xnlock_get_irqsave(&nklock, s);
            thread->container = &pse51_kqueues(0)->threadq;
            appendq(thread->container, &thread->link);
            xnlock_put_irqrestore(&nklock, s);
            #ifdef CONFIG_XENO_OPT_PERVASIVE
            thread->hkey.u_tid = 0;
            thread->hkey.mm = NULL;
            #endif /* CONFIG_XENO_OPT_PERVASIVE */
            /* We need an anonymous registry entry to obtain a handle for fast
               mutex locking. */
            ret = xnthread_register(&thread->threadbase, "");
            if (ret) {
                    thread_destroy(thread);
                    return ret;
            }
            *tid = thread;                /* Must be done before the thread is started. */

            /* Do not start shadow threads (i.e. start == NULL). */

            if (start) {

                    sattr.mode = 0;
                    sattr.imask = 0;
                    sattr.affinity = thread->attr.affinity;
                    sattr.entry = thread_trampoline;
                    sattr.cookie = thread;
                    xnpod_start_thread(&thread->threadbase, &sattr);
            }
            return 0;
             }

  - The thread created is called timerthread; its main job is to call the function clock_nanosleep:

  .. code-block:: prettyprint

        int clock_nanosleep(clockid_t clock_id,
                        int flags,
                        const struct timespec *rqtp, struct timespec *rmtp){

            xnthread_t *cur;
            spl_t s;
            int err = 0;
            if (xnpod_unblockable_p())
                    return EPERM;
            if (clock_id != CLOCK_MONOTONIC && clock_id != CLOCK_REALTIME)
                    return ENOTSUP;
            if ((unsigned long)rqtp->tv_nsec >= ONE_BILLION)
                    return EINVAL;
            if (flags & ~TIMER_ABSTIME)
                    return EINVAL;
            cur = xnpod_current_thread();
            xnlock_get_irqsave(&nklock, s);
            thread_cancellation_point(cur);
            xnpod_suspend_thread(cur, XNDELAY, ts2ticks_ceil(rqtp) + 1,
                                 clock_flag(flags, clock_id), NULL);
            thread_cancellation_point(cur);
            if (xnthread_test_info(cur, XNBREAK)) {
                    if (flags == 0 && rmtp) {
                            xnsticks_t rem;
                            rem = xntimer_get_timeout_stopped(&cur->rtimer);
                            xnlock_put_irqrestore(&nklock, s);
                            ticks2ts(rmtp, rem > 1 ? rem : 0);
                    } else
                            xnlock_put_irqrestore(&nklock, s);
                    return EINTR;
            }
            xnlock_put_irqrestore(&nklock, s);
            return err;
        }

  clock_nanosleep mainly relies on the following APIs:

  - xnpod_current_thread

  .. code-block:: prettyprint

      
      #define xnpod_current_thread() \
        (xnpod_current_sched()->curr)

      #define xnpod_current_sched() \
        xnpod_sched_slot(xnarch_current_cpu())

      #define xnpod_sched_slot(cpu) \
        (&nkpod->sched[cpu])      //so what we end up with is (&nkpod->sched[current_cpu])->curr
      
  - xnarch_current_cpu
  
  .. code-block:: prettyprint

      static inline unsigned xnarch_current_cpu(void)
      {
            return rthal_processor_id();    //xnarch code is platform-dependent, so it goes through rthal, the abstraction layer
      }

  - rthal_processor_id()

  .. code-block:: prettyprint

      #define rthal_processor_id() ipipe_processor_id()
      #define ipipe_processor_id()  (0) //the Raspberry Pi has a single CPU and no SMP, so this macro is the one used

  This shows that hardware-related calls go from xnarch -> rthal -> ipipe.

  - xnlock_get_irqsave

  .. code-block:: prettyprint

      #define xnlock_get_irqsave(lock,x) \
            ((x) = __xnlock_get_irqsave(lock  XNLOCK_DBG_CONTEXT))
      static inline spl_t
      __xnlock_get_irqsave(xnlock_t *lock /*, */ XNLOCK_DBG_CONTEXT_ARGS)

      {
            unsigned long flags;
            rthal_local_irq_save(flags);
            if (__xnlock_get(lock /*, */ XNLOCK_DBG_PASS_CONTEXT))
                    flags |= 2;        /* Recursive acquisition */
            return flags;
      }

  - rthal_local_irq_save

  .. code-block:: prettyprint

      #define rthal_local_irq_save(x) ((x) = ipipe_test_and_stall_pipeline_head() & 1)

      static inline unsigned long ipipe_test_and_stall_pipeline_head(void)
      {
            return ipipe_test_and_stall_head();
      }
      static inline unsigned long ipipe_test_and_stall_head(void)
      {
            hard_local_irq_disable();
            return __test_and_set_bit(IPIPE_STALL_FLAG, &__ipipe_head_status);
      }

  - hard_local_irq_disable

  .. code-block:: prettyprint

      static inline void hard_local_irq_disable_notrace(void)
      {

      #if __LINUX_ARM_ARCH__ >= 6

            __asm__("cpsid i        @ __cli" : : : "memory", "cc");

      #else /* linux arch <= 5 */

            unsigned long temp;
            __asm__ __volatile__(
                    "mrs        %0, cpsr                @ hard_local_irq_disable\n"
                    "orr        %0, %0, #128\n"
                    "msr        cpsr_c, %0"
                    : "=r" (temp)
                    :
                    : "memory", "cc");
      #endif /* linux arch <= 5 */
      }

  - __xnlock_get

  .. code-block:: prettyprint

      #define xnlock_get(lock) __xnlock_get(lock  XNLOCK_DBG_CONTEXT)
      static inline int __xnlock_get(xnlock_t *lock /*, */ XNLOCK_DBG_CONTEXT_ARGS)
      {
            unsigned long long start;
            int cpu = xnarch_current_cpu();
            if (atomic_read(&lock->owner) == cpu)
                    return 1;
            xnlock_dbg_prepare_acquire(&start);
            if (unlikely(atomic_cmpxchg(&lock->owner, ~0, cpu) != ~0))
                    __xnlock_spin(lock /*, */ XNLOCK_DBG_PASS_CONTEXT);
            xnlock_dbg_acquired(lock, cpu, &start /*, */ XNLOCK_DBG_PASS_CONTEXT);
            return 0;
      }

  - thread_cancellation_point
   
  .. code-block:: prettyprint

      static inline void thread_cancellation_point (xnthread_t *thread)
      {
        pthread_t cur = thread2pthread(thread);
        if(cur && cur->cancel_request
            && thread_getcancelstate(cur) == PTHREAD_CANCEL_ENABLE )
            pse51_thread_abort(cur, PTHREAD_CANCELED);
      }

      void pse51_thread_abort(pthread_t thread, void *status)
      {
            thread_exit_status(thread) = status;
            thread_setcancelstate(thread, PTHREAD_CANCEL_DISABLE);
            thread_setcanceltype(thread, PTHREAD_CANCEL_DEFERRED);
            xnpod_delete_thread(&thread->threadbase);
      }

  - xnpod_delete_thread
  - xnpod_suspend_thread
  - xnthread_test_info
  - xnlock_put_irqrestore
  
  .. code-block:: prettyprint

      static inline void xnlock_put_irqrestore (xnlock_t *lock, spl_t flags) 
      {
        /* Only release the lock if we didn't take it recursively. */
        if (!(flags & 2)) 
            xnlock_put (lock);
        rthal_local_irq_restore (flags & 1); 
      }

      static inline void xnlock_put (xnlock_t *lock) 
      {
        if (xnlock_dbg_release(lock))
            return;
        /*
         * Make sure all data written inside the lock is visible to
         * other CPUs before we release the lock.
         */

        xnarch_memory_barrier();
        atomic_set(&lock->owner, ~0);

      }   

      static inline int xnlock_dbg_release(xnlock_t *lock)
      {
        extern xnlockinfo_t xnlock_stats[];
        unsigned long long lock_time = rthal_rdtsc() - lock->lock_date;
        int cpu = xnarch_current_cpu();
        xnlockinfo_t *stats = &xnlock_stats[cpu];
        if (unlikely(atomic_read(&lock->owner) != cpu)) {
            rthal_emergency_console();
            printk(KERN_ERR "Xenomai: unlocking unlocked nucleus lock %p"
                    " on CPU #%d\n"
                    "         owner  = %s:%u (%s(), CPU #%d)\n",
                   lock, cpu, lock->file, lock->line, lock->function,
                   lock->cpu);
            show_stack(NULL,NULL);
            return 1;
        }
        lock->cpu = -lock->cpu; /* File that we released it. */
        if (lock_time > stats->lock_time) {
            stats->lock_time = lock_time;
            stats->spin_time = lock->spin_time;
            stats->file = lock->file;
            stats->function = lock->function;
            stats->line = lock->line;
        }
        return 0;
      }

      #define xnarch_memory_barrier() __sync_synchronize() 
      #define rthal_local_irq_restore(x)  ipipe_restore_pipeline_head(x)   

      static inline xnticks_t xntimer_get_timeout_stopped (xntimer_t *timer)
      {
        return timer->base->ops->get_timer_timeout (timer);
      }
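
The ``__xnlock_get`` / ``xnlock_put`` pair shown above boils down to an owner word that is claimed with an atomic compare-and-exchange, plus a shortcut for recursive acquisition by the same CPU. The sketch below restates that idea with C11 atomics. It is an illustration rather than the nucleus implementation: ``struct owner_lock`` and its functions are hypothetical names, and the debug accounting and interrupt masking are left out.

```c
#include <assert.h>
#include <stdatomic.h>

#define LOCK_FREE (-1)   /* "no owner"; the nucleus code uses ~0 for this */

struct owner_lock {
    atomic_int owner;    /* CPU id of the holder, or LOCK_FREE */
};

void owner_lock_init(struct owner_lock *l)
{
    atomic_init(&l->owner, LOCK_FREE);
}

/*
 * Returns 1 when this CPU already holds the lock (recursive acquisition),
 * 0 when the lock was newly taken. Spins while another CPU owns it.
 */
int owner_lock_get(struct owner_lock *l, int cpu)
{
    int expected;

    assert(cpu >= 0);
    if (atomic_load(&l->owner) == cpu)
        return 1;
    for (;;) {
        expected = LOCK_FREE;
        if (atomic_compare_exchange_weak(&l->owner, &expected, cpu))
            return 0;
        /* lost the race or spurious failure: try again */
    }
}

/* Release the lock; the caller must skip this after a recursive get. */
void owner_lock_put(struct owner_lock *l, int cpu)
{
    assert(atomic_load(&l->owner) == cpu);
    atomic_store(&l->owner, LOCK_FREE);  /* sequentially consistent store */
}
```

The return value plays the role of bit 2 in the ``flags`` of ``xnlock_get_irqsave``: the caller keeps it and skips the release when the acquisition was recursive.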

* `Cyclictest <https://rt.wiki.kernel.org/index.php/Cyclictest>`_
* Test case: POSIX interval timer, interval 500 microseconds, 100000 loops, 100% load.

  - Command line: cyclictest -t1 -p 80 -i 500 -l 100000

* With PREEMPT Linux

.. code-block:: prettyprint

    root@raspberrypi:/home/pi# sudo ./cyclictest -t1 -p 80 -i 500 -l 100000
    # /dev/cpu_dma_latency set to 0us
    policy: fifo: loadavg: 0.00 0.01 0.05 1/61 2064          
    T: 0 ( 2063) P:80 I:500 C: 100000 Min:     27 Act:   49 Avg:   42 Max:    1060

* With RT-PREEMPT

.. code-block:: prettyprint

    Linux raspberrypi 3.6.11+ #474 PREEMPT Thu Jun 13 17:14:42 BST 2013 armv6l GNU/Linux
    Min:     22 Act:   31 Avg:   32 Max:     169

* With Xenomai

.. code-block:: prettyprint

    Linux raspberrypi 3.8.13-core+ #1 Thu Feb 27 03:02:16 CST 2014 armv6l GNU/Linux
    Min:      1 Act:    5 Avg:    6 Max:      41

.. code-block:: prettyprint

    root@raspberrypi:/home/pi# /usr/xenomai/bin/cyclictest -t1 -p 80 -i 500 -l 10000 
    0.08 0.06 0.05 1/61 2060          
    T: 0 ( 2060) P:80 I:     500 C:  100000 Min:      -4 Act:      -2 Avg:       0 Max:      30

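T: thread

P: priority

I: interval

C: number of cycles executed

Min: minimum latency

Act: latency of the most recent cycle

Avg: average latency

Max: maximum latency

The most important value is Max. To guarantee real-time behaviour the worst case must be known, so that developers can estimate how long the system needs to respond in the worst situation.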

Hackpad
=======
*  Discussion and notes: https://embedded2014.hackpad.com/Xenomai-raspberry-note-XwJtuQn9nkD

*  Summary: https://embedded2014.hackpad.com/Xenomai-z2CJPjPLTer

Team members
============
* 向澐
* 林家宏
* 呂科進
* 趙愷文
* 阮志偉
* 陳建霖


References
==========
* https://code.google.com/p/picnc/wiki/RPiXenomaiKernel
* https://code.google.com/p/picnc/wiki/CreateRaspbianLinuxCNC
* http://www.camelsoftware.com/firetail/blog/raspberry-pi/real-time-operating-systems/
* `Quadruped Linux robot feels its way over obstacles <http://linuxgizmos.com/hyq-quadruped-robot-runs-real-time-linux/>`_
* `Choosing between Xenomai and Linux for real-time applications <https://www.osadl.org/fileadmin/dam/rtlws/12/Brown.pdf>`_
* `Real Time Systems <http://www.slideshare.net/anil_pugalia/real-time-systems>`_
* http://www.cs.ru.nl/lab/xenomai/exercises/