2023-02-26

How does a signal handler return? (2)

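Preface:

Picking up where we left off: last time we introduced the signal stack frame, that is, the following:

Stack layout on entry to the signal handler

The previous part was too broad, so this time we will look at it in more detail.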

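Of the stack frame pushed by the kernel, which information is actually useful?

As mentioned last time, when the kernel inserts a signal handler it really just stores some necessary context on the handler's stack. That context is the following structure: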

typedef struct ucontext_t
{
    unsigned long int uc_flags;
    struct ucontext_t *uc_link;
    stack_t uc_stack;
    mcontext_t uc_mcontext;
    sigset_t uc_sigmask;
} ucontext_t;

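Here uc_mcontext is the full register context (including the floating-point state), and it has the same layout as struct sigcontext. The former is hard to read, though, so for readability we usually view it as a struct sigcontext. The program below prints this register context.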
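To make the "same layout" claim concrete, here is a minimal sketch that compares field offsets in the two views. It assumes x86-64 Linux with glibc (or another libc that exposes the REG_* indices under _GNU_SOURCE); `greg_offset` and `mcontext_matches_sigcontext` are names made up for this illustration.

```c
#define _GNU_SOURCE   /* needed for the REG_* indices */
#include <signal.h>
#include <stddef.h>
#include <ucontext.h>

/* Byte offset of gregs[idx] inside mcontext_t. */
static size_t greg_offset(int idx) {
    return offsetof(mcontext_t, gregs) + (size_t)idx * sizeof(greg_t);
}

/* Returns 1 if both views place the sampled fields at the same offsets. */
int mcontext_matches_sigcontext(void) {
    return sizeof(mcontext_t) == sizeof(struct sigcontext)
        && greg_offset(REG_R8)      == offsetof(struct sigcontext, r8)
        && greg_offset(REG_RAX)     == offsetof(struct sigcontext, rax)
        && greg_offset(REG_RSP)     == offsetof(struct sigcontext, rsp)
        && greg_offset(REG_RIP)     == offsetof(struct sigcontext, rip)
        && greg_offset(REG_EFL)     == offsetof(struct sigcontext, eflags)
        && greg_offset(REG_CSGSFS)  == offsetof(struct sigcontext, cs)
        && greg_offset(REG_ERR)     == offsetof(struct sigcontext, err)
        && greg_offset(REG_TRAPNO)  == offsetof(struct sigcontext, trapno)
        && greg_offset(REG_OLDMASK) == offsetof(struct sigcontext, oldmask)
        && greg_offset(REG_CR2)     == offsetof(struct sigcontext, cr2);
}
```

If the function returns 1, casting &ctx->uc_mcontext to struct sigcontext * reads the same bytes under friendlier names, which is all the cast in this post relies on.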

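#define _GNU_SOURCE

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

// Note: printf is not async-signal-safe; it is used here for demonstration only.
void handler(int signo, siginfo_t *info, void *uctx) {
    ucontext_t *ctx = uctx;
    // View the machine context through the more readable struct sigcontext.
    struct sigcontext *sigctx = (struct sigcontext *)&ctx->uc_mcontext;
    (void)signo;
    (void)info;
    // cs, gs and fs are 16-bit fields, so every value is cast to match %lu.
    printf("%lu\n", (unsigned long)sigctx->cr2);
    printf("%lu\n", (unsigned long)sigctx->cs);
    printf("%lu\n", (unsigned long)sigctx->eflags);
    printf("%lu\n", (unsigned long)sigctx->err);
    printf("%lu\n", (unsigned long)sigctx->__fpstate_word);
    printf("%lu\n", (unsigned long)sigctx->fs);
    printf("%lu\n", (unsigned long)sigctx->gs);
    printf("%lu\n", (unsigned long)sigctx->oldmask);
    printf("%lu\n", (unsigned long)sigctx->r10);
    printf("%lu\n", (unsigned long)sigctx->r11);
    printf("%lu\n", (unsigned long)sigctx->r12);
    printf("%lu\n", (unsigned long)sigctx->r13);
    printf("%lu\n", (unsigned long)sigctx->r14);
    printf("%lu\n", (unsigned long)sigctx->r15);
    printf("%lu\n", (unsigned long)sigctx->r8);
    printf("%lu\n", (unsigned long)sigctx->r9);
    printf("%lu\n", (unsigned long)sigctx->rax);
    printf("%lu\n", (unsigned long)sigctx->rbx);
    printf("%lu\n", (unsigned long)sigctx->rcx);
    printf("%lu\n", (unsigned long)sigctx->rdi);
    printf("%lu\n", (unsigned long)sigctx->rdx);
    printf("%lu\n", (unsigned long)sigctx->rip);
    printf("%lu\n", (unsigned long)sigctx->rsi);
    printf("%lu\n", (unsigned long)sigctx->rsp);
    printf("%lu\n", (unsigned long)sigctx->trapno);
}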

int main() {
    // Give the signal handler its own stack.
    stack_t s;
    s.ss_flags = 0;
    s.ss_size = 40000;
    s.ss_sp = malloc(s.ss_size);
    sigaltstack(&s, NULL);

    // Install the handler; SA_SIGINFO selects the three-argument form.
    struct sigaction act = {0};
    act.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigfillset(&act.sa_mask);
    act.sa_sigaction = (void (*)(int, siginfo_t *, void *))handler;
    sigaction(SIGINT, &act, NULL);

    // Spin until Ctrl-C delivers SIGINT.
    while (1) {

    }
}

/* output:
    omitted; it is not meaningful, just a dump of context values
*/

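Let's go through the fields of struct sigcontext. oldmask records the signal mask that was in effect before the interruption, which has to be reinstated afterwards. The general-purpose registers (rax through rdx, rsi, rdi, rbp, r8 through r15) and eflags need little explanation: programs use them all the time, so they must be restored. (The fs and gs slots are a different story: on current x86-64 Linux they are kept for ABI compatibility, and the kernel does not save or restore the selectors through them.) rip holds the saved program counter, that is, the address at which the interrupted code will resume. rsp holds the stack pointer from before the interruption and is used to restore the stack.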

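About the cs register: a user-mode program should never modify cs (or gs, fs, ss, or any other segment register). For a 64-bit process on x86-64 Linux, cs is always 0x33. We can nevertheless change cs with a far jump such as jmp cs:rip. If we load a selector the process is not allowed to use, the CPU raises a protection fault (if you are curious, read up on the CPU's privilege protection mechanism; I know very little about how it works in 64-bit mode), and the kernel then sends the corresponding signal to the program.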
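The 0x33 value is easy to observe, since reading a segment register is not privileged. A minimal sketch, assuming a 64-bit process on x86-64 Linux and a GCC-compatible compiler; `read_cs` is a helper name invented for this example.

```c
/* Reads the current code-segment selector (x86-64 only). */
unsigned short read_cs(void) {
    unsigned short cs;
    /* Reading %cs is allowed in ring 3; only loading a new value is restricted. */
    __asm__ __volatile__("mov %%cs, %0" : "=r"(cs));
    return cs;
}
```

The low two bits of the selector are the requested privilege level, so 0x33 ends in binary 11, i.e. ring 3.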

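What is cr2 doing here? Does it need to be restored as well? No. It does not need to be restored, and a user-mode program has no permission to write it anyway. Here is a demo that tries to write the cr2 register: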

int main() {
    // Writing a control register is privileged, so this faults in ring 3.
    __asm__ __volatile__(
        "mov $0, %%rax\n\t"
        "mov %%rax, %%cr2"
        :
        :
        : "rax"
    );
}

// './main' terminated by signal SIGSEGV (Address boundary error)

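Why can't it be written? Because user programs run in ring 3, the least privileged level. When the program tries to execute this instruction, the CPU's protection mechanism kicks in and raises an exception, and the kernel takes over. The kernel then sends a signal to the program. Think about what happens when a program divides by zero: the CPU also raises an exception, and a specific signal is delivered. The principle is the same. The difference is that writing cr2 triggers a CPU exception because of privilege, while dividing by zero triggers one because of the arithmetic; otherwise the two cases are very similar.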

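So I have a question here: why put things like cs and cr2, which we clearly cannot change (or should not change), into sigcontext at all? It seems unnecessary to me. If anyone knows, please tell me.

When a segmentation fault occurs, cr2 holds the address whose access caused the most recent page fault. We can read this cr2 value out of sigcontext. The same address is also available from siginfo_t, as the code below shows: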

#define _GNU_SOURCE

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>


void handler(int signo, siginfo_t *info, void *uctx) {
    ucontext_t *ctx = uctx;
    struct sigcontext *sigctx = (struct sigcontext *)&ctx->uc_mcontext;
    (void)signo;
    // Both values are the address whose access faulted.
    printf("%p %p\n", info->si_addr, (void *)sigctx->cr2);
}

int main() {
    // Give the signal handler its own stack.
    stack_t s;
    s.ss_flags = 0;
    s.ss_size = 40000;
    s.ss_sp = malloc(s.ss_size);
    sigaltstack(&s, NULL);

    // Install the handler for SIGSEGV.
    struct sigaction act = {0};
    act.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigfillset(&act.sa_mask);
    act.sa_sigaction = (void (*)(int, siginfo_t *, void *))handler;
    sigaction(SIGSEGV, &act, NULL);

    // Deliberately write to an unmapped address to trigger SIGSEGV.
    *(volatile char *)0x123 = 'c';

}

// output:
// prints "0x123 0x123" in a loop, because returning from the handler
// re-executes the faulting store

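The trapno and err fields relate to CPU exceptions and are not restored either: trapno is the vector number of the exception that led to the signal (14 for a page fault), and err is the error code the hardware reported along with it. (These two fields may be useful to experts, but I have not found a use for them myself. If anyone knows one, please tell me.)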

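Everything above was about sigcontext, but that is only part of what the kernel pushes. The whole thing is defined by ucontext_t, which has several fields besides uc_mcontext. What are they for? For ordinary handler code, mostly nothing: uc_link in particular is never used by the kernel. There is one caveat worth knowing, though: when the handler returns, the kernel does read back uc_sigmask (the signal mask to reinstate) and uc_stack (the alternate-stack settings), and that is exactly what makes it possible to change the signal mask together with the jump in the next section.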

The main event, an advanced trick: using a stack overflow to jump and set the signal mask atomically from a signal handler

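#define _GNU_SOURCE

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>

void move_here() {
    while(1) {
        printf("Woohoo\n");
        printf("Take off\n");
    }
}

// Gadget: load syscall number 15 (rt_sigreturn) and enter the kernel.
// Written as top-level asm so there is no compiler prologue to skip over.
void gadget(void);
__asm__(
    ".pushsection .text\n"
    ".globl gadget\n"
    ".type gadget, @function\n"
    "gadget:\n\t"
    "mov $0xf, %eax\n\t"
    "syscall\n"
    ".popsection\n"
);

ucontext_t uctx;

// Assumes x86-64 Linux and GCC or Clang. Never returns to its caller.
__attribute__((noinline))
void sigreturnto(void (*dest)(), void *stack_top) {
    // frame[0] is the saved rbp, frame[1] this function's return address.
    volatile unsigned long *frame = __builtin_frame_address(0);
    memset(&uctx, 0, sizeof(ucontext_t));
    uctx.uc_mcontext.gregs[REG_RIP] = (greg_t)(uintptr_t)dest;
    // Function entry expects rsp % 16 == 8, as if a call had pushed a return address.
    uctx.uc_mcontext.gregs[REG_RSP] = (greg_t)((uintptr_t)stack_top - 8);
    uctx.uc_mcontext.gregs[REG_CSGSFS] = 0x33;  // cs = 0x33; gs and fs stay zero
    // Overwrite the return address so that `ret` lands on the gadget ...
    frame[1] = (unsigned long)(uintptr_t)gadget;
    // ... with rsp pointing at the fake ucontext_t, which is where rt_sigreturn reads it.
    memcpy((void *)&frame[2], &uctx, sizeof(ucontext_t));
}

int main() {
    // Stack for the new context; sized generously because printf needs room.
    char *stack = malloc(65536);
    sigreturnto(move_here, stack + 65536);
}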

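This uses a small stack-overflow trick: a carefully crafted overflow plants a fake frame on the stack and reaches system call 15 (rt_sigreturn). The approach is really an attack technique known as sigreturn-oriented programming (SROP). If you are interested in it, the references provide a few links for further reading.

The code below is more direct and does not rely on the stack-overflow trick: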

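#define _GNU_SOURCE

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>

// Assumes x86-64 Linux. rt_sigreturn (syscall 15) reads the ucontext_t that rsp
// points at, so aim rsp at the context and enter the kernel. Never returns.
void sigreturn_to(ucontext_t *ctx) {
    __asm__ __volatile__(
        "mov %0, %%rsp\n\t"
        "mov $15, %%eax\n\t"
        "syscall"
        :
        : "r"(ctx)
        : "memory"
    );
    __builtin_unreachable();
}

void come_here() {
    while (1) {
        printf("123\n");
    }
}

int main() {
    // Stack for the new context; sized generously because printf needs room.
    char *stack = malloc(65536);
    void *stack_top = stack + 65536;

    ucontext_t ctx;
    memset(&ctx, 0, sizeof(ucontext_t));
    ctx.uc_mcontext.gregs[REG_RIP] = (greg_t)(uintptr_t)come_here;
    ctx.uc_mcontext.gregs[REG_CSGSFS] = 0x33;  // cs = 0x33; gs and fs stay zero
    // Function entry expects rsp % 16 == 8, as if a call had pushed a return address.
    ctx.uc_mcontext.gregs[REG_RSP] = (greg_t)((uintptr_t)stack_top - 8);

    sigreturn_to(&ctx);
}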

What can you do with it?

Play.