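While reading the mediasoup source code a while ago, I was puzzled by its internal inter-process communication, so I am writing down what I found.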

Inter-process communication in mediasoup

mediasoup is a WebRTC SFU that provides a Node.js API for building WebRTC servers.

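Its underlying networking library is libuv: the core worker is built entirely on libuv, and the Node.js layer talks to the C++ worker process through inter-process communication.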

The C++ worker code that sets up communication with the Node.js layer uses two libuv functions, uv_pipe_init and uv_pipe_open:

handles/UnixStreamSocket.cpp:   this->uvHandle       = new uv_pipe_t;
handles/UnixStreamSocket.cpp:   err = uv_pipe_init(DepLibUV::GetLoop(), this->uvHandle, 0);
handles/UnixStreamSocket.cpp:           MS_THROW_ERROR_STD("uv_pipe_init() failed: %s", uv_strerror(err));
handles/UnixStreamSocket.cpp:   err = uv_pipe_open(this->uvHandle, fd);
handles/UnixStreamSocket.cpp:           MS_THROW_ERROR_STD("uv_pipe_open() failed: %s", uv_strerror(err));

The fd argument in uv_pipe_open(this->uvHandle, fd) is defined in main.cpp:

static constexpr int ChannelFd{ 3 }; 

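Surprisingly, the fd is hard-coded to 3, and it is a file descriptor. At first glance it looks like it simply continues the sequence after stdin, stdout, and stderr. How the communication actually works, though, takes some more digging.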

Judging from the libuv API, I initially assumed it was using the pipe-related system calls on Linux. After all, the name is uv_pipe.

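But a channel created by pipe() on Linux is unidirectional, while mediasoup appears to use its channel in full-duplex fashion. Moreover, pipe() hands back descriptors that are allocated at call time, whereas here the descriptor value is fixed in advance, so the two simply don't match up. This puzzled me for a while.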

So I turned to the uv_pipe documentation to find out:

int uv_pipe_open(uv_pipe_t* handle, uv_file file)
Open an existing file descriptor or HANDLE as a pipe.

Changed in version 1.2.1: the file descriptor is set to non-blocking mode.

The documentation says it opens an existing file descriptor, but gives no further detail.

So, back to the Node.js layer code:

this._child = spawn(
	// command
	workerBin,
	// args
	workerArgs,
	// options
	{
		env :
		{
			MEDIASOUP_VERSION : version
		},

		detached : false,

		/*
		 * fd 0 (stdin)   : Just ignore it.
		 * fd 1 (stdout)  : Pipe it for 3rd libraries that log their own stuff.
		 * fd 2 (stderr)  : Same as stdout.
		 * fd 3 (channel) : Channel fd.
		 */
		stdio : [ 'ignore', 'pipe', 'pipe', 'pipe' ]
	});

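As you can see, the Node.js layer starts the C++ worker as a child process and then uses this._child.stdio[index] as a duplex stream to communicate with it, which hides the details of the underlying transport.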
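
To make the duplex use of a single extra descriptor concrete, here is a minimal sketch. It assumes a Unix-like system with sh and cat available; the shell command is only a stand-in for the worker, and the helper name echoOverFd3 is made up for this example rather than taken from mediasoup.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: send `message` to a child over fd 3 and collect
// whatever the child sends back on the same descriptor.
function echoOverFd3(message) {
  return new Promise((resolve, reject) => {
    // Stand-in for the worker: cat reads from fd 3 and writes back to fd 3.
    const child = spawn('sh', ['-c', 'cat <&3 >&3'], {
      // fd 0 ignored, fd 1 and fd 2 shared with the parent, fd 3 is the channel.
      stdio: ['ignore', 'inherit', 'inherit', 'pipe']
    });

    const channel = child.stdio[3]; // one stream, used for both directions
    let received = '';

    channel.setEncoding('utf8');
    channel.on('data', (chunk) => { received += chunk; });
    channel.on('end', () => resolve(received));
    channel.on('error', reject);
    child.on('error', reject);

    // Write the request, then half-close so that cat sees EOF and exits.
    channel.end(message);
  });
}

echoOverFd3('ping\n').then((reply) => console.log(JSON.stringify(reply)));
```

The parent both writes to and reads from child.stdio[3], which is the same pattern the worker channel relies on.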

How full-duplex parent-child communication works

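Node.js networking is also driven by libuv. So, to learn the low-level details of how a Node.js parent talks to its child, is reading the source the only option?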

There is another way, of course: strace. Start with a simple program:

cat xx.js:

const { spawn } = require('child_process');
const ls = spawn('sleep', 
        ['1s'],
        {
            stdio: ['pipe', 'pipe', 'pipe', 'pipe', 'pipe', 'pipe']
        }
        );

Running strace -f node xx.js lays the underlying system calls bare:

[pid 15782] socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [11, 12]) = 0
[pid 15782] socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [13, 14]) = 0
[pid 15782] socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [15, 16]) = 0
[pid 15782] socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [17, 18]) = 0
[pid 15782] socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [19, 20]) = 0
[pid 15782] socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [21, 22]) = 0

[pid 15788] dup2(12, 0)                 = 0
[pid 15788] fcntl(0, F_GETFL)           = 0x2 (flags O_RDWR)
[pid 15788] close(11)                   = 0
[pid 15788] dup2(14, 1)                 = 1
[pid 15788] fcntl(1, F_GETFL)           = 0x2 (flags O_RDWR)
[pid 15788] close(13)                   = 0
[pid 15788] dup2(16, 2)                 = 2
[pid 15788] fcntl(2, F_GETFL)           = 0x2 (flags O_RDWR)
[pid 15788] close(15)                   = 0
[pid 15788] dup2(18, 3)                 = 3
[pid 15788] close(17)                   = 0
[pid 15788] dup2(20, 4)                 = 4
[pid 15788] close(19)                   = 0
[pid 15788] dup2(22, 5)                 = 5
[pid 15788] close(21)                   = 0
[pid 15788] close(12)                   = 0
[pid 15788] close(14)                   = 0
[pid 15788] close(16)                   = 0
[pid 15788] close(18)                   = 0
[pid 15788] close(20)                   = 0
[pid 15788] close(22)                   = 0
[pid 15782] close(13)                   = 0
[pid 15782] close(15)                   = 0
[pid 15782] close(21)                   = 0
[pid 15782] close(19)                   = 0
[pid 15782] close(17)                   = 0
[pid 15782] close(11)                   = 0

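Taking one of the socketpair calls as an example:

  • First, the parent process 15782 calls socketpair, which yields the two descriptors 11 and 12
  • The child process inherits both descriptors
  • The child calls dup2(12, 0) to duplicate descriptor 12 onto descriptor 0
  • The child closes 11, which is the parent's end, and later the now-redundant original 12
  • The parent keeps 11 as its end of the pair; the close(11) by 15782 at the end of the trace is presumably cleanup when the stream is torn down
  • Parent and child communicate over the full-duplex pair (11 in the parent, 0 in the child)

So the reason Node.js's spawn and libuv's uv_pipe can provide full-duplex parent-child communication is that they call socketpair. This is a textbook full-duplex parent-child setup, and it accounts for part of the underlying implementation.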
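
This conclusion can be cross-checked without strace by asking a child what kind of object it received in a 'pipe' slot. The following is a minimal sketch, assuming a Unix-like system; probeChannelFd and the inline probe script are names invented for this example.

```javascript
const { spawnSync } = require('child_process');

// Script run inside the child: report what kind of object sits behind fd 3.
const probe = `
  const st = require('fs').fstatSync(3);
  console.log(JSON.stringify({ isSocket: st.isSocket(), isFIFO: st.isFIFO() }));
`;

// Hypothetical helper: spawn a Node.js child with an extra 'pipe' slot at
// fd 3 and return the child's own view of that descriptor.
function probeChannelFd() {
  const result = spawnSync(process.execPath, ['-e', probe], {
    // fd 1 is captured so the report can be read; fd 3 is the slot under test.
    stdio: ['ignore', 'pipe', 'inherit', 'pipe'],
    encoding: 'utf8'
  });
  return JSON.parse(result.stdout);
}

console.log(probeChannelFd());
```

If the slot is backed by socketpair, as the trace indicates, the child should report a socket rather than a FIFO.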


update:

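This article is based on an early version of mediasoup 3. As of 2021, mediasoup has split the communication between the Node.js layer and the C++ layer into separate read and write channels, each using its own descriptor.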
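
The idea of one descriptor per direction can be sketched as follows. This is not mediasoup's actual channel code: the descriptor numbers 3 and 4, the tr command standing in for the worker, and the names upperCaseViaChild, toChild and fromChild are all assumptions made for illustration, and a Unix-like system is assumed.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: write to the child on fd 3, read its answer on fd 4.
function upperCaseViaChild(message) {
  return new Promise((resolve, reject) => {
    // Stand-in for the worker: tr reads requests on fd 3, replies on fd 4.
    const child = spawn('sh', ['-c', 'tr a-z A-Z <&3 >&4'], {
      stdio: ['ignore', 'inherit', 'inherit', 'pipe', 'pipe']
    });

    const toChild = child.stdio[3];   // used only for writing
    const fromChild = child.stdio[4]; // used only for reading
    let received = '';

    fromChild.setEncoding('utf8');
    fromChild.on('data', (chunk) => { received += chunk; });
    fromChild.on('end', () => resolve(received));
    fromChild.on('error', reject);
    toChild.on('error', reject);
    child.on('error', reject);

    toChild.resume();     // discard anything arriving on the write channel
    toChild.end(message); // send the request and signal EOF to the child
  });
}

upperCaseViaChild('ping\n').then((reply) => console.log(JSON.stringify(reply)));
```

Each descriptor is still capable of carrying both directions; the split is a choice made at the application level.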

Looking at the libuv documentation again, the description has since been updated:

The uv_pipe_t structure represents more than just pipe(7) (or |), but supports any streaming file-like objects. On Windows, the only object of that description is the Named Pipe. On Unix, this could be any of Unix Domain Socket, or derived from mkfifo(1), or it could actually be a pipe(7). When uv_spawn initializes a uv_pipe_t due to the UV_CREATE_PIPE flag, it opts for creating a socketpair(2).

This states that parent-child communication set up with the UV_CREATE_PIPE flag is in fact implemented with socketpair.