Doubling TensorFlow Performance on a MacBook M1

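I have a MacBook M1 laptop on hand. Most applications are not compatible with it, and VMware Fusion does not support Linux virtual machines. Parallels reportedly supports the ARM builds of Windows and Linux, but it does not seem to work well either. The one thing the machine is still good for is machine learning: TensorFlow 2.5 now supports the M1 natively and is considerably faster than 2.4, but it requires macOS 12, which is still in beta. This article records how I set up a TensorFlow environment on the M1 and ran a few simple tests. Judging from the results, the performance gain is quite noticeable.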
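Upgrade to macOS 12
The TensorFlow build that Apple developed specifically for the M1 is no longer needed, because TensorFlow 2.5 supports the M1 natively. The first step is therefore to upgrade to macOS 12; the tutorial below can serve as a reference.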
https://zhuanlan.zhihu.com/p/378946858
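Set up the Conda environment
Anaconda does not support the M1 processor yet, and the preinstalled Python is 3.8, which does not run natively on the ARM processor. The open-source miniforge is used instead; it ships with Python 3.9.
The following is excerpted from the miniforge GitHub home page.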
Miniforge3
Latest installers with Python 3.9 (*) in the base environment:

OS        Architecture                   Download
Linux     x86_64 (amd64)                 Miniforge3-Linux-x86_64
Linux     aarch64 (arm64) (**)           Miniforge3-Linux-aarch64
Linux     ppc64le (POWER8/9)             Miniforge3-Linux-ppc64le
OS X      x86_64                         Miniforge3-MacOSX-x86_64
OS X      arm64 (Apple Silicon) (***)    Miniforge3-MacOSX-arm64
Windows   x86_64                         Miniforge3-Windows-x86_64
(*) The Python version is specific only to the base environment. Conda can create new environments with different Python versions and implementations.
(**) While the Raspberry PI includes a 64 bit processor, the RasbianOS is built on a 32 bit kernel and is not a supported configuration for these installers. We recommend using a 64 bit linux distribution such as Ubuntu for Raspberry PI.
(***) Apple silicon builds are experimental and haven't had testing like the other platforms.
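Although conda's support for the M1 is still experimental, Python 3.9 itself runs natively on the M1 processor, and conda is only used here to manage Python packages.
During installation I ran into a problem where conda was killed by zsh, possibly because Anaconda had been installed earlier. I tried many fixes, including installing the full Xcode, and none of them worked; switching to a different installation path finally solved it. In theory Xcode is not required and installing miniforge alone is enough.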
https://github.com/conda-forge/miniforge/issues/190
Installation is simple: download the installer and run it.
./Miniforge3-MacOSX-arm64.sh

Answer yes or accept the defaults at every prompt. After the installation finishes, restart the terminal and check that conda and python run. In my case the interpreter is Python 3.9.6.
(base) ~ % python
Python 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:35:11)
[Clang 11.1.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
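
The banner above shows the Python version but not the CPU architecture the interpreter was built for. The following is a minimal sketch, not part of the original article, that reports both using only the standard library. The helper names `describe_interpreter` and `is_native_arm64` are hypothetical, and the assumption is that a native Apple Silicon build reports `arm64` from `platform.machine()` while an x86_64 build running under Rosetta reports `x86_64`.

```python
import platform
import sys


def describe_interpreter() -> str:
    """Return a one-line summary of the running interpreter and its architecture."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    machine = platform.machine()  # assumed to be "arm64" for a native Apple Silicon build
    return f"Python {version} on {sys.platform}/{machine}"


def is_native_arm64(machine: str, system: str) -> bool:
    """True when the reported platform looks like a native Apple Silicon build."""
    return system == "Darwin" and machine == "arm64"


if __name__ == "__main__":
    print(describe_interpreter())
    print("native arm64:", is_native_arm64(platform.machine(), platform.system()))
```

If the second line prints `False` on an M1, the interpreter on the PATH is probably not the one installed by miniforge.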

Switch to mirrors hosted in China: open or create ~/.condarc and add the following:
channels:
  - https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
  - https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
  - defaults
show_channel_urls: true
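
Indentation matters in this YAML file, so it can be convenient to generate it rather than type it. The sketch below is an illustration under that assumption, not something the original article does: it renders the same channel list into text and writes it to a scratch file. The names `CHANNELS`, `render_condarc`, `write_condarc` and the file name `condarc.example` are hypothetical.

```python
from pathlib import Path

# Mirror URLs copied from the article; "defaults" keeps conda's stock channels as a fallback.
CHANNELS = [
    "https://mirrors.ustc.edu.cn/anaconda/pkgs/main/",
    "https://mirrors.ustc.edu.cn/anaconda/cloud/conda-forge/",
    "https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/",
    "defaults",
]


def render_condarc(channels, show_channel_urls=True) -> str:
    """Render a channel list as the text of a .condarc file."""
    lines = ["channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append(f"show_channel_urls: {str(show_channel_urls).lower()}")
    return "\n".join(lines) + "\n"


def write_condarc(path: Path, channels=CHANNELS) -> Path:
    """Write the rendered configuration to `path` and return the path."""
    path.write_text(render_condarc(channels), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Written to a scratch file on purpose; copy it to ~/.condarc once it looks right.
    target = write_condarc(Path("condarc.example"))
    print(target.read_text(encoding="utf-8"))
```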

Install a package to check whether the mirror is being used. As the output shows, it is.
(base) niuxinli@niuxinlideMacBook-Pro ~ % conda install pandas
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/niuxinli/miniforge3

  added / updated specs:
    - pandas

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    bottleneck-1.3.2           |   py39heec5a64_1          96 KB  https://mirrors.ustc.edu.cn/anaconda/pkgs/main
    ca-certificates-2021.7.5   |       hca03da5_1         113 KB

Install PyCharm
PyCharm supports the M1 processor; downloading the Community edition is enough.

Create a conda environment for PyCharm (it is named pycharm in the commands that follow).

Install TensorFlow
Install the dependencies:
conda activate pycharm
conda install -c apple tensorflow-deps

Install TensorFlow with pip
The default pip index is too slow, so the Aliyun mirror is used for this one command.
python -m pip install tensorflow-macos -i https://mirrors.aliyun.com/pypi/simple/

Install the Metal plugin
python -m pip install tensorflow-metal -i https://mirrors.aliyun.com/pypi/simple/
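
With tensorflow-macos and tensorflow-metal both installed, a quick way to confirm that the plugin was picked up is to ask TensorFlow which GPU devices it can see. The sketch below is not from the original article; it uses the public `tf.config.list_physical_devices` call, and the helper name `list_gpus` is hypothetical. It returns `None` rather than failing when TensorFlow cannot be imported.

```python
def list_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("TensorFlow is not importable in this environment")
    elif gpus:
        print("GPU devices:", gpus)
    else:
        print("No GPU visible; check that tensorflow-metal is installed")
```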

Install some other dependencies
brew install libjpeg
pip install tensorflow-datasets -i https://mirrors.aliyun.com/pypi/simple/
conda install -y pandas matplotlib scikit-learn jupyterlab

After the installation, import numpy failed with the following error:
Original error was: dlopen(/Users/niuxinli/miniforge3/envs/pycharm/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-darwin.so, 0x0002): Library not loaded: @rpath/libcblas.3.dylib

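After some searching, I tried installing opencv-python to see whether that would help. It did make the import error go away, although pip then reported that tensorflow-macos 2.5 is incompatible with numpy 1.21.2. I ignored that for the time being.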
pip install opencv-python -i https://mirrors.aliyun.com/pypi/simple/

This is the error reported during installation:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-macos 2.5.0 requires numpy~=1.19.2, but you have numpy 1.21.2 which is incompatible.

Judging from the runs below, this error did not stop TensorFlow from working normally.
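
The conflict pip reports comes from the compatible-release specifier `numpy~=1.19.2`, which means "at least 1.19.2, and still within 1.19.*". The following sketch, added for illustration, reimplements that rule for plain numeric versions only, to show why 1.21.2 is rejected. The function names are hypothetical; a real project would more likely rely on pip or the `packaging` library than on this simplified check.

```python
def parse_version(text: str) -> tuple:
    """Turn a plain dotted version such as '1.21.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_compatible_release(installed: str, required: str) -> bool:
    """Check `installed` against a `~=required` specifier (numeric versions only).

    `~=1.19.2` means: at least 1.19.2, and the same prefix once the last
    component of the requirement is dropped (that is, 1.19.*).
    """
    have, want = parse_version(installed), parse_version(required)
    prefix = want[:-1]
    return have >= want and have[:len(prefix)] == prefix


if __name__ == "__main__":
    print(satisfies_compatible_release("1.21.2", "1.19.2"))  # False: outside 1.19.*
    print(satisfies_compatible_release("1.19.5", "1.19.2"))  # True
```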
Test TensorFlow
To compare TensorFlow performance on the M1, I found a blog post online with benchmark results and code. The link is:
https://zhuanlan.zhihu.com/p/350955566
That author installed TensorFlow on macOS 11, so in theory the performance should be lower than with the installation described above. I made small compatibility adjustments to the code and left everything else unchanged.
import tensorflow as tf
import tensorflow_datasets as tfds
import time
from datetime import timedelta
from tensorflow.python.framework.ops import disable_eager_execution

# Run in graph mode rather than eager mode.
disable_eager_execution()

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)


def normalize_img(image, label):
    """Scale uint8 pixel values to float32 in [0, 1]."""
    return tf.cast(image, tf.float32) / 255., label


batch_size = 128

# Training pipeline: normalize, cache, shuffle the whole split, batch, prefetch.
ds_train = ds_train.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
ds_train = ds_train.batch(batch_size)
ds_train = ds_train.prefetch(tf.data.experimental.AUTOTUNE)

# Test pipeline: normalize, batch, cache, prefetch.
ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_test = ds_test.batch(batch_size)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(0.001),
    metrics=['accuracy'],
)

start = time.time()

model.fit(
    ds_train,
    epochs=10,
    # validation_steps=1,
    # steps_per_epoch=469,
    # validation_data=ds_test  # With this line enabled, as in the original script, the script fails to run; no fix found yet.
)

delta = (time.time() - start)
elapsed = str(timedelta(seconds=delta))
print('Elapsed Time: {}'.format(elapsed))

While the script runs, GPU utilization is close to 100%.

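The run time was almost always about 1 minute 32 seconds. Compared with the 3 minutes 20 seconds reported in the referenced post, that cuts the time by more than half and comes close to a Colab GPU.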
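
The arithmetic behind this comparison can be spelled out in a few lines. This is a small worked example added for clarity; the two durations are the ones quoted in the text, and the variable names are illustrative.

```python
from datetime import timedelta

baseline = timedelta(minutes=3, seconds=20)  # time reported in the referenced post
measured = timedelta(minutes=1, seconds=32)  # time measured in this article

speedup = baseline / measured       # dividing two timedeltas yields a float
reduction = 1 - measured / baseline  # fraction of the baseline time saved

print(f"speedup: {speedup:.2f}x")        # speedup: 2.17x
print(f"time saved: {reduction:.0%}")    # time saved: 54%
```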

In short, with macOS 12 and TensorFlow 2.5 installed on the M1, performance is roughly double what it was with the earlier setup.
