0x01 OpCode

opcode又称为操作码，是将python源代码进行编译之后的结果，python虚拟机无法直接执行human-readable的源代码，因此python编译器第一步先将源代码进行编译，以此得到opcode。例如在执行python程序时一般会先生成一个pyc文件，pyc文件就是编译后的结果，其中含有opcode序列。

如何查看一个函数的OpCode?

def a():
 if 1 == 2:
  print("flag{****}")

print "Opcode of a():",a.__code__.co_code.encode('hex')

通过此方法我们可以得到a函数的OpCode

co_code string of raw compiled bytecode

Opcode of a(): 6401006402006b020072140064030047486e000064000053

我们可以通过dis库获得相应的解析结果。

1
2
3

import dis

dis.dis('6401006402006b020072140064030047486e000064000053'.decode('hex'))

得到反编译的结果

> 0 LOAD_CONST     1 (1)
>     3 LOAD_CONST     2 (2)
>     6 COMPARE_OP     2 (==)
>     9 POP_JUMP_IF_FALSE  20
>    12 LOAD_CONST     3 (3)
>    15 LOAD_BUILD_CLASS
>    16 YIELD_FROM   
>    17 JUMP_FORWARD    0 (to 20)
>  \>>  20 LOAD_CONST     0 (0)
>    23 RETURN_VALUE

常见的字节码指令

为了进一步研究OpCode，我们可以对dis的disassemble_string函数进行patch

在124行加入

1	print hex(op).ljust(6),

可以查看具体的字节码。

> 0 LOAD_CONST      0x64    1 (1)
>     3 LOAD_CONST      0x64    2 (2)
>     6 COMPARE_OP      0x6b    2 (==)
>     9 POP_JUMP_IF_FALSE  0x72   20
>    12 LOAD_CONST      0x64    3 (3)
>    15 LOAD_BUILD_CLASS   0x47 
>    16 YIELD_FROM      0x48 
>    17 JUMP_FORWARD     0x6e    0 (to 20)
>  \>>  20 LOAD_CONST      0x64    0 (0)
>    23 RETURN_VALUE     0x53
>

变量

指令名	操作
LOAD_GLOBAL	读取全局变量
STORE_GLOBAL	给全局变量赋值
LOAD_FAST	读取局部变量
STORE_FAST	给局部变量赋值
LOAD_CONST	读取常量

IF

指令名	操作
POP_JUMP_IF_FALSE	当条件为假的时候跳转
JUMP_FORWARD	直接跳转

CMP_OP

1	cmp_op = ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is','is not', 'exception match', 'BAD')

其余的指令参考OpCode源码

0x02 利用OpCode改变程序运行逻辑

在Python中，我们可以对任意函数的__code__参数进行赋值，通过对其进行赋值，我们可以改变程序运行逻辑。

Example1

1
2
3

def a():
 if 1 == 2:
  print("flag{****}")

在沙箱环境中我们需要调用这个函数，但是此函数我们无法执行到print语句。因此我们需要通过某种方法得到flag

Solution 1

我们直接获取a.code.co_consts，查看所有的常量。即可知道flag

1	(None, 1, 2, 'flag{****}')

Solution 2

更改程序运行逻辑

CodeType构造函数

1
2
3

def __init__(self, argcount, nlocals, stacksize, flags, code,
     consts, names, varnames, filename, name, 
     firstlineno, lnotab, freevars=None, cellvars=None):

上述函数其余参数均可通过__code.__.co_xxx获得

因此我们

def a():
 if 1 == 2:
  print("flag{****}")

for name in dir(a.__code__):
 print name,getattr(a.__code__,name)

输出

> co_argcount 0
>  co_cellvars ()
>  co_code ddkrdGHndS
>  co_consts (None, 1, 2, 'flag{****}')
>  co_filename example1.py
>  co_firstlineno 1
>  co_flags 67
>  co_freevars ()
>  co_lnotab
>
> co_name a
>  co_names ()
>  co_nlocals 0
>  co_stacksize 2
>  co_varnames ()

构造相应目标代码

def a():
 if 1 != 2:
  print("flag{****}")

print "Opcode of a():",a.__code__.co_code.encode('hex')

得到code

6401006402006b030072140064030047486e000064000053

构造payload

def a():
 if 1 == 2:
  print("flag{****}")

newcode = type(a.__code__)
code = "6401006402006b030072140064030047486e000064000053".decode('hex')
code = newcode(0,0,2,67,code,(None, 1, 2, 'flag{****}'),(),(),"xxx","a",1,"")
a.__code__ = code

a()

即可输出flag

Example 2

def target(flag):
 def printflag():
  if flag == "":
   print flag
 return printflag

flag = target("flag{*******}")

这一次因为是通过变量传入参数，我们无法通过上一次读co_consts获得变量。但是我们这次依旧可以通过重写code获得flag。

构造替代函数

def target(flag):
 def printflag():
  if flag != "":
   print flag
 return printflag
a = target("xxx")
import types
code = a.__code__.co_code.encode('hex')
print code

EXP

newcode = type(flag.__code__)
code = "8800006401006b030072140088000047486e000064000053".decode('hex')
code = newcode(0,0,2,19,code,(None, ''),(),(),"example2.py","printflag",2,"",('flag',),())
flag.__code__ = code
flag()

> ➜ python example2exp.py
>  8800006401006b030072140088000047486e000064000053
>  ➜ python example2.py  
>  flag{*******}

谷谷点程序 » 使用OpCode绕过Python沙箱的方法详解