A Brief Look at How Airtest Image Recognition Works - JD Cloud Developer Community

### 1 Introduction to Airtest

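Airtest is a cross-platform UI automation testing framework based on image recognition. It targets games and apps and supports Windows, Android, and iOS. The framework builds on the ideas of Sikuli, a graphical scripting language: instead of writing every step as code, you capture images of buttons or input fields and compose test scenarios from those images, which makes it easier to get started.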

### 2 Airtest in Practice

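The AirtestIDE main window consists of a menu bar, a quick toolbar, and several panels. In the default layout, the "Devices" panel is where you connect to and interact with devices.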

![](//img1.jcloudcs.com/developer.jdcloud.com/814655fd-a5da-4522-a34d-b53dcf13a2cf20220811175630.png)

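The UI automation script for the 京管家 app was adapted as follows.
Steps:

1. Connect an Android phone through ADB
2. Install the app's APK
3. Launch the app and take a screenshot
4. Simulate user input (taps, swipes, key presses)
5. Uninstall the app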
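
The five steps above map onto functions in Airtest's Python API roughly as follows. This is a minimal sketch, not the author's actual script: the APK path, package name, and coordinates are placeholders, and the `api` parameter is a hypothetical hook added here so the flow can be exercised without a device (by default it imports `airtest.core.api`).

```python
def run_app_flow(apk_path, package, api=None):
    """Install an app, interact with it, then uninstall it."""
    if api is None:
        # Requires the airtest package and a device reachable through ADB.
        import airtest.core.api as api

    api.connect_device("Android:///")    # 1. connect to the default Android device
    api.install(apk_path)                # 2. install the APK
    api.start_app(package)               # 3. launch the app ...
    api.snapshot(filename="home.png")    #    ... and take a screenshot
    api.touch((540, 960))                # 4. tap at absolute coordinates (placeholder)
    api.swipe((540, 1500), (540, 500))   #    swipe upwards (placeholder)
    api.keyevent("BACK")                 #    press a key
    api.stop_app(package)
    api.uninstall(package)               # 5. uninstall the app
```

On a machine with a connected phone, the call would look like `run_app_flow("demo.apk", "com.example.demo")`, where both arguments are made-up values.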

![](//img1.jcloudcs.com/developer.jdcloud.com/22336135-fa21-4636-a51f-3e2214f393a520220811175645.png)

How to run a .air script:

```shell
# run automated cases and scenarios on various devices
> airtest run "path to your .air dir" --device Android:///
> airtest run "path to your .air dir" --device Android://adbhost:adbport/serialno
> airtest run "path to your .air dir" --device Windows:///?title_re=Unity.*
> airtest run "path to your .air dir" --device iOS:///
...
# show help
> airtest run -h
usage: airtest run [-h] [--device [DEVICE]] [--log [LOG]]
                   [--recording [RECORDING]]
                   script

positional arguments:
  script                air path

optional arguments:
  -h, --help            show this help message and exit
  --device [DEVICE]     connect dev by uri string, e.g. Android:///
  --log [LOG]           set log dir, default to be script dir
  --recording [RECORDING]
                        record screen when running
  --compress
                        set snapshot quality, 1-99
  --no-image [NO_IMAGE]
                        Do not save screenshots
```
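
The `--device` values above all share a URI shape: platform, optional host, optional serial number or window handle, and optional query parameters. The sketch below only illustrates that shape with the standard library; it is not Airtest's own parser, and `describe_device_uri` is a hypothetical helper name.

```python
from urllib.parse import urlparse, parse_qs


def describe_device_uri(uri):
    """Break a device URI into platform, host, target, and query options."""
    parsed = urlparse(uri)
    return {
        "platform": parsed.scheme,                  # urlparse lowercases the scheme
        "host": parsed.netloc or None,              # e.g. "127.0.0.1:5037" for an ADB server
        "target": parsed.path.lstrip("/") or None,  # serial number or window handle
        "options": {k: v[0] for k, v in parse_qs(parsed.query).items()},
    }


print(describe_device_uri("Android://127.0.0.1:5037/SERIAL123"))
print(describe_device_uri("Windows:///?title_re=Unity.*"))
```

The host, port, and serial number in the first call are placeholder values standing in for `adbhost:adbport/serialno`.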

How to generate a report for a .air script:

```shell
> airtest report "path to your .air dir"
log.html
> airtest report -h
usage: airtest report [-h] [--outfile OUTFILE] [--static_root STATIC_ROOT]
                      [--log_root LOG_ROOT] [--record RECORD [RECORD ...]]
                      [--export EXPORT] [--lang LANG]
                      script

positional arguments:
  script                script filepath

optional arguments:
  -h, --help            show this help message and exit
  --outfile OUTFILE     output html filepath, default to be log.html
  --static_root STATIC_ROOT
                        static files root dir
  --log_root LOG_ROOT   log & screen data root dir, logfile should be
                        log_root/log.txt
  --record RECORD [RECORD ...]
                        custom screen record file path
  --export EXPORT       export a portable report dir containing all resources
  --lang LANG           report language
```
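
When reports are generated from a CI job, it can help to assemble the command from Python. The sketch below only builds an argument list from the flags shown in the help output above; `build_report_command` is a hypothetical helper, and the value `en` for `--lang` is an assumption.

```python
def build_report_command(script_dir, log_root=None, outfile=None, lang=None, export_dir=None):
    """Build the argument list for `airtest report` from optional settings."""
    cmd = ["airtest", "report", script_dir]
    if log_root:
        cmd += ["--log_root", log_root]
    if outfile:
        cmd += ["--outfile", outfile]
    if lang:
        cmd += ["--lang", lang]
    if export_dir:
        cmd += ["--export", export_dir]
    return cmd


print(build_report_command("demo.air", log_root="logs", outfile="report.html", lang="en"))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where the `airtest` command is installed.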

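### 3 How Airtest Locates Elements

Taking the touch method as an example, this section walks through how Airtest derives an element's position from an image and then triggers the tap.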

```python
@logwrap
def touch(v, times=1, **kwargs):
    """
    Perform the touch action on the device screen

    :param v: target to touch, either a ``Template`` instance or absolute coordinates (x, y)
    :param times: how many touches to be performed
    :param kwargs: platform specific `kwargs`, please refer to corresponding docs
    :return: final position to be clicked, e.g. (100, 100)
    :platforms: Android, Windows, iOS
    """
    if isinstance(v, Template):
        pos = loop_find(v, timeout=ST.FIND_TIMEOUT)
    else:
        try_log_screen()
        pos = v
    for _ in range(times):
        G.DEVICE.touch(pos, **kwargs)
        time.sleep(0.05)
    delay_after_operation()
    return pos

click = touch  # click is alias of touch
```

This method obtains the coordinates through loop_find and then performs the tap with G.DEVICE.touch(pos, **kwargs). Next, let's see how loop_find turns a template into coordinates.

```python
@logwrap
def loop_find(query, timeout=ST.FIND_TIMEOUT, threshold=None, interval=0.5, intervalfunc=None):
    """
    Search for image template in the screen until timeout

    Args:
        query: image template to be found in screenshot
        timeout: time interval how long to look for the image template
        threshold: default is None
        interval: sleep interval before next attempt to find the image template
        intervalfunc: function that is executed after unsuccessful attempt to find the image template

    Raises:
        TargetNotFoundError: when image template is not found in screenshot

    Returns:
        TargetNotFoundError if image template not found, otherwise returns the position where the image template has
        been found in screenshot

    """
    G.LOGGING.info("Try finding: %s", query)
    start_time = time.time()
    while True:
        screen = G.DEVICE.snapshot(filename=None, quality=ST.SNAPSHOT_QUALITY)

        if screen is None:
            G.LOGGING.warning("Screen is None, may be locked")
        else:
            if threshold:
                query.threshold = threshold
            match_pos = query.match_in(screen)
            if match_pos:
                try_log_screen(screen)
                return match_pos

        if intervalfunc is not None:
            intervalfunc()

        # Raise on timeout; otherwise sleep and try again:
        if (time.time() - start_time) > timeout:
            try_log_screen(screen)
            raise TargetNotFoundError('Picture %s not found in screen' % query)
        else:
            time.sleep(interval)
```
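
The polling pattern in loop_find can be isolated from Airtest entirely. The following is a minimal sketch under stated assumptions: `poll_until_found`, `find_once`, and `TargetNotFound` are hypothetical names, the default timeout and interval are illustrative values, and the `sleep` parameter exists only so the loop can be run without real waiting.

```python
import time


class TargetNotFound(Exception):
    """Hypothetical stand-in for Airtest's TargetNotFoundError."""


def poll_until_found(find_once, timeout=20.0, interval=0.5, sleep=time.sleep):
    """Call find_once() repeatedly until it returns a position or time runs out."""
    start = time.monotonic()
    while True:
        position = find_once()  # one screenshot-and-match attempt
        if position:
            return position
        if time.monotonic() - start > timeout:
            raise TargetNotFound("target not found within %.1f s" % timeout)
        sleep(interval)  # wait before the next attempt


# Usage: the third attempt succeeds.
attempts = iter([None, None, (10, 20)])
print(poll_until_found(lambda: next(attempts), timeout=5, interval=0))
```

As in loop_find, the timeout is checked only after a failed attempt, so at least one attempt is always made even with a timeout of zero.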

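It first captures the device screen with screen = G.DEVICE.snapshot(filename=None, quality=ST.SNAPSHOT_QUALITY), then compares the template image passed in against the screenshot to find where the image is located: match_pos = query.match_in(screen). Next, let's look at the logic of the match_in method: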

```python
def match_in(self, screen):
    match_result = self._cv_match(screen)
    G.LOGGING.debug("match result: %s", match_result)
    if not match_result:
        return None
    focus_pos = TargetPos().getXY(match_result, self.target_pos)
    return focus_pos
```
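
The last step of match_in converts a matched rectangle into the point that will actually be tapped. The sketch below assumes a keypad-style numbering for the target position (1 is the top-left corner, 5 the center, 9 the bottom-right corner). It is a simplified model, not the TargetPos implementation, and `focus_position` with its parameters is a hypothetical helper.

```python
def focus_position(rect_left, rect_top, width, height, target_pos=5):
    """Pick one of nine anchor points inside a matched rectangle."""
    if not 1 <= target_pos <= 9:
        raise ValueError("target_pos must be between 1 and 9")
    col = (target_pos - 1) % 3   # 0 = left edge, 1 = middle, 2 = right edge
    row = (target_pos - 1) // 3  # 0 = top edge, 1 = middle, 2 = bottom edge
    x = rect_left + col * width / 2
    y = rect_top + row * height / 2
    return int(x), int(y)


print(focus_position(100, 200, 40, 20))      # center of the rectangle
print(focus_position(100, 200, 40, 20, 9))   # bottom-right corner
```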

It contains one key call: match_result = self._cv_match(screen)

```python
@logwrap
def _cv_match(self, screen):
    # in case image file not exist in current directory:
    ori_image = self._imread()
    image = self._resize_image(ori_image, screen, ST.RESIZE_METHOD)
    ret = None
    for method in ST.CVSTRATEGY:
        # get function definition and execute:
        func = MATCHING_METHODS.get(method, None)
        if func is None:
            raise InvalidMatchingMethodError("Undefined method in CVSTRATEGY: '%s', try 'kaze'/'brisk'/'akaze'/'orb'/'surf'/'sift'/'brief' instead." % method)
        else:
            if method in ["mstpl", "gmstpl"]:
                ret = self._try_match(func, ori_image, screen, threshold=self.threshold, rgb=self.rgb, record_pos=self.record_pos,
                                        resolution=self.resolution, scale_max=self.scale_max, scale_step=self.scale_step)
            else:
                ret = self._try_match(func, image, screen, threshold=self.threshold, rgb=self.rgb)
        if ret:
            break
    return ret
```

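It first reads the template image and resizes it to improve the chance of a match:
image = self._resize_image(ori_image, screen, ST.RESIZE_METHOD)
It then loops over the matching methods with for method in ST.CVSTRATEGY. The values of ST.CVSTRATEGY are defined as: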

```python
CVSTRATEGY = ["mstpl", "tpl", "surf", "brisk"]
if LooseVersion(cv2.__version__) > LooseVersion('3.4.2'):
    CVSTRATEGY = ["mstpl", "tpl", "sift", "brisk"]
```
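
LooseVersion comes from distutils, which was deprecated in Python 3.10 and removed in Python 3.12. The same version check can be written with plain tuples. This is a minimal sketch, not how Airtest does it: `version_tuple` and `default_strategy` are hypothetical helpers, and the version string is passed in so that OpenCV does not need to be installed.

```python
def version_tuple(version):
    """Turn '4.5.1' or '4.5.1-dev' into (4, 5, 1), ignoring non-numeric suffixes."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def default_strategy(opencv_version):
    """Use sift after OpenCV 3.4.2 and surf otherwise, mirroring the snippet above."""
    if version_tuple(opencv_version) > (3, 4, 2):
        return ["mstpl", "tpl", "sift", "brisk"]
    return ["mstpl", "tpl", "surf", "brisk"]


print(default_strategy("3.4.2"))
print(default_strategy("4.5.1"))
```

Comparing tuples of integers avoids the string-comparison trap where "3.4.10" would sort before "3.4.2".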

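func = MATCHING_METHODS.get(method, None) looks up the matcher for each method name. The names in the default strategy are mstpl, tpl, surf, sift, and brisk; whichever one is used, the call goes through the same method, _try_match: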

```python
if method in ["mstpl", "gmstpl"]:
    ret = self._try_match(func, ori_image, screen, threshold=self.threshold, rgb=self.rgb, record_pos=self.record_pos,
                            resolution=self.resolution, scale_max=self.scale_max, scale_step=self.scale_step)
else:
    ret = self._try_match(func, image, screen, threshold=self.threshold, rgb=self.rgb)
```

In every case, _try_match calls find_best_result() on the object that func returns:

```python
@staticmethod
def _try_match(func, *args, **kwargs):
    G.LOGGING.debug("try match with %s" % func.__name__)
    try:
        ret = func(*args, **kwargs).find_best_result()
    except aircv.NoModuleError as err:
        G.LOGGING.warning("'surf'/'sift'/'brief' is in opencv-contrib module. You can use 'tpl'/'kaze'/'brisk'/'akaze'/'orb' in CVSTRATEGY, or reinstall opencv with the contrib module.")
        return None
    except aircv.BaseError as err:
        G.LOGGING.debug(repr(err))
        return None
    else:
        return ret
```
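
Taken together, the loop in _cv_match and the error handling in _try_match form a fallback chain: try each method in order, treat an unavailable algorithm as "no match", and stop at the first hit. The sketch below models that pattern with plain functions. `first_successful_match`, `MatcherUnavailable`, and the toy matchers are all hypothetical and do not perform any image processing.

```python
class MatcherUnavailable(Exception):
    """Hypothetical error: the algorithm is missing from the installed OpenCV build."""


def first_successful_match(strategy, matchers, template, screen):
    """Return (method_name, result) for the first method that finds a match."""
    for name in strategy:
        matcher = matchers.get(name)
        if matcher is None:
            raise ValueError("undefined matching method: %r" % name)
        try:
            result = matcher(template, screen)
        except MatcherUnavailable:
            result = None  # skip this method and fall through to the next one
        if result:
            return name, result
    return None, None


def surf_stub(template, screen):
    raise MatcherUnavailable("surf needs the contrib module")


matchers = {
    "surf": surf_stub,
    "tpl": lambda template, screen: None,                  # finds nothing
    "brisk": lambda template, screen: {"confidence": 0.9},  # finds a match
}
print(first_successful_match(["surf", "tpl", "brisk"], matchers, "btn.png", "screen.png"))
```

The order of the strategy list matters: cheaper or more reliable methods placed first are tried first, and later methods run only when earlier ones fail.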

Taking find_best_result() of the TemplateMatching class as an example, let's see how it is implemented.

```python
@print_run_time
def find_best_result(self):
    """Find the best result via template matching; keep only the best region."""
    # Step 1: validate the input images
    check_source_larger_than_search(self.im_source, self.im_search)
    # Step 2: compute the template-matching result matrix res
    res = self._get_template_result_matrix()
    # Step 3: read the best match out of the matrix
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    h, w = self.im_search.shape[:2]
    # Compute the confidence:
    confidence = self._get_confidence_from_matrix(max_loc, max_val, w, h)
    # Compute the matched position: target center + target rectangle:
    middle_point, rectangle = self._get_target_rectangle(max_loc, w, h)
    best_match = generate_result(middle_point, rectangle, confidence)
    LOGGING.debug("[%s] threshold=%s, result=%s" % (self.METHOD_NAME, self.threshold, best_match))

    return best_match if confidence >= self.threshold else None
```
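
Step 3 can be reproduced on a small score matrix without OpenCV. The sketch below is a simplified model rather than Airtest's code: it uses the maximum score directly as the confidence (Airtest recomputes it in _get_confidence_from_matrix), and the function name, the dictionary keys, and the 0.7 threshold are illustrative assumptions.

```python
def best_match(result, tmpl_w, tmpl_h, threshold=0.7):
    """Locate the highest score in a result matrix and describe the matched area."""
    # Equivalent of reading max_val and max_loc from cv2.minMaxLoc.
    max_val, (x, y) = max(
        (score, (col, row))
        for row, scores in enumerate(result)
        for col, score in enumerate(scores)
    )
    if max_val < threshold:
        return None
    # max_loc is the top-left corner of the template inside the screenshot.
    middle_point = (int(x + tmpl_w / 2), int(y + tmpl_h / 2))
    rectangle = (
        (x, y),                    # top-left
        (x, y + tmpl_h),           # bottom-left
        (x + tmpl_w, y + tmpl_h),  # bottom-right
        (x + tmpl_w, y),           # top-right
    )
    return {"result": middle_point, "rectangle": rectangle, "confidence": max_val}


scores = [[0.10, 0.20],
          [0.95, 0.30]]
print(best_match(scores, tmpl_w=4, tmpl_h=2))
```

The center point is what ends up being tapped, which is why a match is reported as both a point and a rectangle.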

The key part is step 2, which computes the template-matching result matrix: res = self._get_template_result_matrix()

```python
def _get_template_result_matrix(self):
    """Compute the template-matching result matrix."""
    # Grayscale matching: both images are converted to grayscale before cv2.matchTemplate()
    s_gray, i_gray = img_mat_rgb_2_gray(self.im_search), img_mat_rgb_2_gray(self.im_source)
    return cv2.matchTemplate(i_gray, s_gray, cv2.TM_CCOEFF_NORMED)
```

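As you can see, the work is ultimately done by OpenCV's cv2.matchTemplate. Whichever method in CVSTRATEGY produces a match first has its result returned.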
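
To make the TM_CCOEFF_NORMED score concrete: for each window of the screenshot, subtract the window's mean and the template's mean, then divide the dot product of the two by the product of their norms. The result lies between -1 and 1, and 1 means a perfect match. The following pure-Python sketch implements that formula on small grayscale grids. It is for illustration only, is far slower than OpenCV, and the function name is hypothetical.

```python
from math import sqrt


def match_template_ccoeff_normed(image, template):
    """Slide template over image and return a matrix of correlation scores."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])

    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = sqrt(sum(d * d for d in t_dev))

    result = []
    for y in range(ih - th + 1):
        scores = []
        for x in range(iw - tw + 1):
            w_vals = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = sqrt(sum(d * d for d in w_dev))
            denom = t_norm * w_norm
            # A flat window or flat template has zero variance; score it as 0.
            dot = sum(a * b for a, b in zip(t_dev, w_dev))
            scores.append(dot / denom if denom else 0.0)
        result.append(scores)
    return result


screen = [[5, 5, 5, 5],
          [5, 10, 200, 5],
          [5, 200, 10, 5],
          [5, 5, 5, 5]]
button = [[10, 200],
          [200, 10]]
for row in match_template_ccoeff_normed(screen, button):
    print([round(score, 3) for score in row])
```

Because both the window and the template are mean-subtracted and normalized, the score is insensitive to uniform brightness changes, which is one reason this mode suits screenshots taken on different devices.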

### 4 Summary

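For components that cannot be located through UI controls, locating them by image recognition is very convenient. Airtest has two drawbacks. First, buttons or controls with a transparent background are hard to recognize. Second, it cannot read text content, although this can be addressed by adding a text recognition (OCR) library such as pytesseract.
When writing UI automation scripts you can combine several frameworks. uiautomator locates elements quickly, but on pages built with Flutter some components often cannot be located; in those cases you can bring in Airtest and locate them by image.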

------------
###### 自猿其说Tech - JDL JD Logistics Technology and Data Intelligence Department
###### Author: 范文君