How to Install TensorRT?

RobertJasiek · #1

The quick & dirty way of unpacking LizzieYZY and copying files from LizzieYZY\katago_tensorRT worked. However, I also want to use the official way by downloading from Nvidia. By getting relatively newer libraries, I hope for faster speed. Besides, we cannot rely on an unofficial installation forever because we do not know for how long LizzieYZY will be available and come with TensorRT library files. The official way should always work.

After a day spent successfully installing Nvidia's CUDA and CuDNN, I have not succeeded with Nvidia's TensorRT after another half a day. How to install Nvidia's TensorRT for Windows so that it works?

This is a quick summary of successfully installing Nvidia's CUDA and CuDNN:

Now to the problem of then installing Nvidia's TensorRT for Windows. Information is partially contradictory or outdated, and contains advanced stuff we probably do not need. This makes an already very difficult installation even more difficult.

Simply installing the most recent Nvidia versions of CUDA, CuDNN and TensorRT fails. They rely on CUDA 12 while KataGo still relies on CUDA 11. I have tried CUDA 12 and renamed two files but, although KataGo starts, one such library calls the other such library and expects the new file name. Therefore, instead we must use some archived Nvidia files for CUDA 11.

The KataGo download file names suggest that we need CUDA 11.2.2, CuDNN 8.5 for CUDA 11.2 and TensorRT 8.5 for CUDA 11.2. The closest download file names are cuda_11.2.2_461.33_win10.exe, cudnn-windows-x86_64-8.5.0.96_cuda11-archive.zip and TensorRT-8.5.3.1.Windows10.x86_64.cuda-11.8.cudnn8.6.zip, where confusingly the file name 8.5.3 denotes version 8.5 update 2. I have tried them but the TensorRT part fails.

However, one of Nvidia's installation manuals suggests that this TensorRT version presumes CuDNN 8.6 instead of CuDNN 8.5. Needless to say, I have also tried that in vain.

This made me curious and I have studied the 11 quick & dirty working files from LizzieYZY:

Code:

cublas64_11.dll                 11.6.1 file version 6.14.11
cublasLt64_11.dll               11.6.1 file version 6.14.11
cudart64_110.dll                11.4   file version 6.14.11
cudnn_cnn_infer64_8.dll         11.4   file version 6.14.11
cudnn_ops_infer64_8.dll         11.4   file version 6.14.11
cudnn64_8.dll                   6.5.0  file version 6.14.11
msvcr110.dll                    \Windows\System32 version 11.0.51106.1
nvinfer.dll                     ?
nvinfer_builder_resource.dll    ?
nvrtc64_112_0.dll               11.4   file version 6.14.11
nvrtc-builtins64_114.dll        ?

Three of these files do not declare any versions. The other files mention the versions 11.6.1, 11.4, 6.14.11 and 6.5.0. Now, I have to guess: which of these versions refers to either CUDA, CuDNN or TensorRT? Surely, I cannot do 81 trial & error installations and deinstallations of all three packages! Which versions of CUDA, CuDNN or TensorRT for Windows should we download so that they will work together?

For CUDA and CuDNN to work, I have had to add Nvidia's \lib subdirectory to Path in the Windows system environment variables. Additionally, I have added the subdirectories \include and \bin, in vain. Must there be yet another path to \lib\x64?
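
One way to narrow down the Path question is to check, for each of the 11 file names listed above, which directory would actually supply it. The following Python lines are only a rough sketch under assumptions: the helper name locate_dlls is made up for illustration, and it looks only in the folders passed to it and in Path, whereas the real Windows DLL search also covers the program's own folder and the system directories.

```python
import os
from pathlib import Path

# The 11 library file names from the LizzieYZY folder listed above.
REQUIRED_DLLS = [
    "cublas64_11.dll", "cublasLt64_11.dll", "cudart64_110.dll",
    "cudnn_cnn_infer64_8.dll", "cudnn_ops_infer64_8.dll", "cudnn64_8.dll",
    "msvcr110.dll", "nvinfer.dll", "nvinfer_builder_resource.dll",
    "nvrtc64_112_0.dll", "nvrtc-builtins64_114.dll",
]


def locate_dlls(names, extra_dirs=(), path_value=None):
    """Map each file name to the first directory that contains it, or None.

    Hypothetical helper: searches extra_dirs first, then the entries of Path.
    This is a simplification of the real Windows DLL search order.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    dirs = [Path(d) for d in extra_dirs]
    dirs += [Path(d) for d in path_value.split(os.pathsep) if d]
    return {
        name: next((d for d in dirs if (d / name).is_file()), None)
        for name in names
    }


if __name__ == "__main__":
    # C:\katago_TensorRT is the KataGo folder used in this post; adjust as needed.
    report = locate_dlls(REQUIRED_DLLS, extra_dirs=[r"C:\katago_TensorRT"])
    for name, where in report.items():
        print(f"{name:32}{where if where else 'NOT FOUND'}")
```

A file reported as NOT FOUND is at least a candidate for the failure; a file found in an unexpected directory may point to a leftover from an earlier installation.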

Then I noticed that Nvidia's installation directory lacks msvcr110.dll. The same file is, however, in the Windows system directory C:\Windows\System32, and I have copied it to C:\katago_TensorRT. The Nvidia installer may have thought: already installed so no need to duplicate it. Fine. It should work being in at least one of the two directories.

For KataGo CUDA, I noticed that the Nvidia installer also did not duplicate zlibwapi.dll, which is in an arcane place on my PC, so I have copied it to C:\katago_TensorRT. I do not know if it is needed for TensorRT. Anyway, after putting it there, this also cannot be the problem.

Then Nvidia's installation manuals talk about advanced stuff: Visual Studio Solution, Python, TensorFlow, PyTorch, the uff, graphsurgeon, onnx_graphsurgeon, Python wheel packages, and a build sample. I think that none of these are needed because the 11 quick & dirty LizzieYZY library files for TensorRT together with KataGo's unpacked libraries work by themselves without any need of the advanced stuff. Am I right or must all that also be installed properly?

How to get TensorRT working in Windows from Nvidia's own download files?

EDIT: My basic installation of the TensorRT part has been:

kvasir · #2

I installed CUDA 11.6.2 more than a year ago and am still using that with KataGo, but when updating recently I updated CUDNN to 8.9.1.23 and TensorRT to 8.5.3.1.

That is I got "katago-v1.13.1-trt8.5-cuda11.2-windows-x64.zip" from github to work with the following:

CUDA 11.6.2
CUDNN 8.9.1.23
TensorRT 8.5.3.1

I did not need zlibwapi.dll with KataGo 1.13.1, but previously I simply copied and renamed ziplib.dll, knowing that this "wapi" stuff is a frequent misnaming that occurs with DLLs (but I have forgotten the details).

I think TensorRT isn't supposed to require CUDNN anymore but I'm not sure in which version that changes.

I had to download TensorRT three times to get it to work when I upgraded. I saw that github had a comment about 8.5.3.1 being needed or working for some version, and that turned out to work for me. The other TensorRT versions failed somewhere in the C runtime.

So make note of the fact that you need closely matching versions to get it to work. It can fail in unexpected ways if the versions don't align tightly with the environment in which KataGo.exe was built. This is a difficulty with CUDA in general; it is not unexpected at all.

All of this Nvidia stuff has documentation about what has been tested together; it just gets muddled quickly in actual use. It is possible to download the same versions of the libraries built for use with different versions of other components, and most combinations receive ZERO testing (even if the version numbers are close).

It is possible that the CUDA 11.6.2 and TensorRT 8.5.3.1 combo works with KataGo 1.13.1, and that a CUDA 11.8.? and TensorRT 8.6.? combo also works with KataGo 1.13.1, while most other blends won't. The KataGo ZIP file name suggests using it with CUDA 11.2, but the TensorRT ZIP file name suggests 11.8; obviously both have additional documentation somewhere (which I must have scanned when I was getting this to work).

Hope you get this to work. It is difficult to help over forums but you are making more progress in installing this than many professionals ever make :tmbup:

BTW since you got KataGo to work with CUDA, it is highly likely that getting it to work with TensorRT is only a matter of selecting the most appropriate TensorRT version. You also need to make sure that these libraries are referred to in the PATH variable (i.e. replacing what you previously installed) or, I guess, copy the files to the katago directory.

pwaldron · #3

NVidia has its compatibility matrix for compatible versions of CUDA, CuDNN and TensorRT at https://docs.nvidia.com/deeplearning/tensorrt/support-matrix/index.html. It references the archives for older versions.

Their matrix implies that TensorRT 8.5 is compatible with CUDA 10.2 and 11.0-11.8 and CuDNN 8.6, but it sounds like you've already tried it.

I am wondering if you are running into the same problem I had at work trying to get legacy ML code to work on a more recent GPU. The problem traced to the computing hardware being more recent than what the CUDA libraries could handle. In the end I had to find not just a more recent version of the library I needed, but one that had been specifically compiled against a more recent version of CUDA.

You may actually find it quicker to put a second drive into your computer, install Linux and build katago from scratch there. You'll likely be able to find more support if you're desperate for TensorRT.

thirdfogie · #4

I have some doubts about the following well-intentioned advice.

Quote:

You may actually find it quicker to put a second drive into your computer, install Linux and build katago from scratch there. You'll likely be able to find more support if you're desperate for TensorRT.

It feels timely to say that Linux is no panacea, though I will persist with it. I recently tried to upgrade from Debian 11 to 12 (bullseye to bookworm), but it did not go well. This contrasts with the upgrade from Debian 10 to 11, which was painless.

The first attempt used the "upgrade in place" method, but the nvidia drivers failed to install and I could not easily see how to fix it. So I reverted to the more familiar "full reinstallation" method. I could not immediately find a pre-built ISO file for xfce with amd64, but managed to get one using ktorrent. The resulting download passed its SHASUM test, and the SHASUM file had a good GPG signature from Debian, so the ISO file may be legitimate.

However, the resulting installation failed. The OpenCL drivers for nvidia work but katago will not run. The "ln -sf" trick used to overcome library-version incompatibilities with libzip, libcrypto and libssl no longer works. There may be some internal version number in the libraries which is now checked and causes failure. In addition, there are obscure errors when dpkg tries to update initramfs: it complains that some raspberry-pi firmware is not available, which is weird because my architecture is 64-bit Intel. Internet research says that this problem can be triggered by the multi-architecture nature of the libraries supplied by nvidia. It is my practice to configure and compile the latest kernel from source after the initial installation, so any failure to recreate initramfs is a show-stopper.

The problems may in part be caused by Debian's decision to split the "non-free" element in /etc/apt/sources.list into "non-free-firmware" and "non-free", which I do not fully understand. This is all getting too complicated as the tendrils of dementia slowly tighten their grip, so it's back to bullseye for me. I may try again in 3 months' time when bookworm has had more time to mature. Repeated experience after other failed experiments and cock-ups says that a full reinstallation and reconfiguration of bullseye can be done in about 5 hours, whereas fixing the problems with the new installation has no time limit.

Also, the recommended way to install katrain under linux is

Code:

pip3 install -U katrain

This worked with Debian 11, but Debian 12 spits out a complex error message about the dangers of installing python code system-wide. Since I am not a python expert and I had already decided to abandon Debian 12, I did not investigate further.

Another person might avoid my mistakes or decide to install Ubuntu or Red Hat and never see the problems I ran into, but there is no way to know that in advance.

If there is a theme to all this, it may be that work done to improve the integrity and security of Linux is making things more complex for idiots. I still have lots to learn.

** EDITED 16 June ** Edited for clarity and to add paragraph on katrain.

RobertJasiek · #5

kvasir wrote:

That is I got "katago-v1.13.1-trt8.5-cuda11.2-windows-x64.zip" from github to work with the following:

CUDA 11.6.2
CUDNN 8.9.1.23
TensorRT 8.5.3.1

The combination of KataGo CUDA 1_13_0, CUDA 11.6.2, CUDNN 8.9.1.23 solves my problem of previously slow CUDA, many thanks! However, the combination of these CUDA and CUDNN versions with KataGo TensorRT 1_13_1 and TensorRT 8.5.3.1 fails. KataGo starts but cannot proceed during benchmark or gtpconfig.

My intermediate solution is:

KataGo CUDA 1_13_0
CUDA 11.6.2
Path C:\Program Files\CUDA\bin
CUDNN 8.9.1.23
KataGo TensorRT 1_13_1
replace nvinfer.dll, nvinfer_builder_resource.dll by those of LizzieYZY|TensorRT

I will try to find a Nvidia TensorRT version for download that also works. Since now I know that only its files need to be replaced, the installation changes can be quick. Downloads and tests will be slow again.

And · #6

RobertJasiek wrote:

...
Downloads and tests will be slow again.

katago_tensorRT from the new version of Lizzieyzy loads as fast as katago_opencl, after the first launch

RobertJasiek · #7

Good, but my tests include running benchmark and gtpconfig while monitoring processes and GPU load, and managing LOGs and installation files. To know whether a TensorRT version works well, I also need to know some visits/s. Checking if it runs at all is insufficient; from CUDA tests, I know that different Nvidia file versions can make 3x - 6x speed difference. All this testing takes time - roughly 45 minutes per TensorRT version.

RobertJasiek · #8

And wrote:

katago_tensorRT from the new version of Lizzieyzy loads as fast as katago_opencl, after the first launch

Do you use version 2_5_2? Do you also use the GUI of LizzieYZY?

On the second launch or later of either KaTrain or Lizzie (without YZY), I get about these launch delays:

23s KaTrain, LizzieYZY_2_5_2_TensorRT files
20s Lizzie, LizzieYZY_2_5_2_TensorRT files

24s KaTrain, Nvidia TensorRT_8_5_2_2 files
22s Lizzie, Nvidia TensorRT_8_5_2_2 files

RobertJasiek · #9

I have found a working TensorRT from Nvidia. The newest working version for CUDA 11 is

TensorRT-8.5.2.2.Windows10.x86_64.cuda-11.8.cudnn8.6.zip

So the installation summary becomes:

KataGo CUDA 1_13_0
CUDA 11.6.2
Path C:\Program Files\CUDA\bin
CUDNN 8.9.1.23
KataGo TensorRT 1_13_1
TensorRT 8.5.2

Actually, the only needed files from TensorRT seem to be nvinfer.dll, nvinfer_builder_resource.dll. Now, I am experimenting brutally: I keep only \bin and have deleted the other directories and a JSON file from \CUDA. Presumably, even more files can be deleted:)

The KataGo TensorRT speeds of the files from Lizzie_2_5_2 and TensorRT 8.5.2 are within margin of error. The latter are 0 ~ 3% slower for different numbers of threads but this may be caused by the test sample.

And · #10

RobertJasiek wrote:

And wrote:

katago_tensorRT from the new version of Lizzieyzy loads as fast as katago_opencl, after the first launch

Do you use version 2_5_2? Do you also use the GUI of LizzieYZY?

On the second launch or later of either KaTrain or Lizzie (without YZY), I get about these launch delays:

23s KaTrain, LizzieYZY_2_5_2_TensorRT files
20s Lizzie, LizzieYZY_2_5_2_TensorRT files

24s KaTrain, Nvidia TensorRT_8_5_2_2 files
22s Lizzie, Nvidia TensorRT_8_5_2_2 files

Lizzieyzy version 2_5_3, run in Sabaki. On my graphics card (GeForce GTX 1650), the startup time has been reduced from ~52 seconds to ~9 seconds.

And · #11

8 sec:

2023-06-15 14:50:47+0400: GTP Engine starting...
2023-06-15 14:50:47+0400: KataGo v1.13.2
2023-06-15 14:50:47+0400: Using TrompTaylor rules initially, unless GTP/GUI overrides this
2023-06-15 14:50:47+0400: Using 1 CPU thread(s) for search
2023-06-15 14:50:47+0400: nnRandSeed0 = 2996345512003287467
2023-06-15 14:50:47+0400: After dedups: nnModelFile0 = E:\katago_tensorRT/default_model.bin.gz useFP16 auto useNHWC auto
2023-06-15 14:50:47+0400: Initializing neural net buffer to be size 19 * 19 exactly
2023-06-15 14:50:49+0400: TensorRT backend thread 0: Found GPU NVIDIA GeForce GTX 1650 memory 4294967296 compute capability major 7 minor 5
2023-06-15 14:50:49+0400: TensorRT backend thread 0: Initializing (may take a long time)
2023-06-15 14:50:55+0400: Using existing plan cache at E:\katago_tensorRT/KataGoData/trtcache/trt-8502_gpu-a6c244b9_net-kata1-b18c384nbt-softplusfixv13-s5971481344-d3261785976_3_exact19x19_batch8_fp16
2023-06-15 14:50:55+0400: TensorRT backend thread 0: Model version 13 useFP16 = true
2023-06-15 14:50:55+0400: TensorRT backend thread 0: Model name: kata1-b18c384nbt-softplusfixv13-s5971481344-d3261785976
2023-06-15 14:50:55+0400: Loaded neural net with nnXLen 19 nnYLen 19
2023-06-15 14:50:55+0400: Initializing board with boardXSize 19 boardYSize 19
2023-06-15 14:50:55+0400: Loaded config E:\katago_tensorRT/default_gtp.cfg
2023-06-15 14:50:55+0400: Loaded model E:\katago_tensorRT/default_model.bin.gz
2023-06-15 14:50:55+0400: Model name: kata1-b18c384nbt-softplusfixv13-s5971481344-d3261785976
2023-06-15 14:50:55+0400: GTP ready, beginning main protocol loop
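
The 8 seconds can be read off the timestamps: the "GTP Engine starting" line is stamped 14:50:47 and the "GTP ready" line 14:50:55. A small Python sketch that does the same subtraction for any such log is below. The function name startup_seconds is hypothetical, and it assumes every line begins with a timestamp in exactly the format shown in the log above.

```python
from datetime import datetime


def startup_seconds(log_lines):
    """Seconds between the 'GTP Engine starting' and 'GTP ready' log lines.

    Assumes each line begins with a 24-character timestamp such as
    '2023-06-15 14:50:47+0400' (format taken from the log above).
    """
    def stamp(line):
        return datetime.strptime(line[:24], "%Y-%m-%d %H:%M:%S%z")

    start = next(stamp(l) for l in log_lines if "GTP Engine starting" in l)
    ready = next(stamp(l) for l in log_lines if "GTP ready" in l)
    return (ready - start).total_seconds()


if __name__ == "__main__":
    sample = [
        "2023-06-15 14:50:47+0400: GTP Engine starting...",
        "2023-06-15 14:50:55+0400: GTP ready, beginning main protocol loop",
    ]
    print(startup_seconds(sample))
```

The same function could be pointed at logs from different TensorRT file sets to compare launch delays without a stopwatch.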

RobertJasiek · #12

I have checked checksums. Nvidia TensorRT 8.5.2.2, LizzieYZY 2_5_2 and 2_5_3 have the same two TensorRT files:

nvinfer.dll

nvinfer_builder_resource.dll
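
For anyone who wants to repeat such a comparison, here is a minimal Python sketch. It is an illustration under assumptions, not a record of the tool used for the check above: SHA-256 is simply one reasonable checksum, the function names are hypothetical, and the two directories are whatever folders hold the files to be compared.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_dirs(dir_a, dir_b, names):
    """Map each file name to True if its checksum matches in both folders."""
    return {
        name: sha256_of(Path(dir_a) / name) == sha256_of(Path(dir_b) / name)
        for name in names
    }


# Example call (the two folder names are placeholders):
# compare_dirs(r"C:\TensorRT-8.5.2.2\lib", r"C:\LizzieYZY\katago_tensorRT",
#              ["nvinfer.dll", "nvinfer_builder_resource.dll"])
```

Reading in chunks keeps memory use low, which matters because the TensorRT libraries are large files.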
