
3 posts tagged with "container"


Implementing Graceful Shutdown in Windows Container

· 5 min read

In a Kubernetes Linux Pod, when a Pod is deleted with kubectl or replaced during a rolling update, the PID 1 process in every container of each terminating Pod receives a SIGTERM signal, which tells the process to release its resources and prepare to exit. If the process has not exited within the period set by the Pod's spec.terminationGracePeriodSeconds, Kubernetes then sends SIGKILL to kill it.

Running kubectl delete --force --grace-period=0 ... has the same effect as sending SIGKILL directly.

However, SIGTERM and SIGKILL do not work in Windows containers. Currently, a Windows container is simply terminated 5 seconds after it receives the terminating instruction.

See: https://v1-18.docs.kubernetes.io/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#v1-pod

  • V1.Pod.terminationGracePeriodSeconds - this is not fully implemented in Docker on Windows, see: reference. The behavior today is that the ENTRYPOINT process is sent CTRL_SHUTDOWN_EVENT, then Windows waits 5 seconds by default, and finally shuts down all processes using the normal Windows shutdown behavior. The 5 second default is actually in the Windows registry inside the container, so it can be overridden when the container is built.

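Based on the community discussion and repeated experiments, the approach that currently works for graceful shutdown in Windows containers is:

1. Extend the wait time by modifying the registry when building the Docker image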

...
RUN reg add hklm\system\currentcontrolset\services\cexecsvc /v ProcessShutdownTimeoutSeconds /t REG_DWORD /d 300 && \
    reg add hklm\system\currentcontrolset\control /v WaitToKillServiceTimeout /t REG_SZ /d 300000 /f
...

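Of the two registry values above, the first is in seconds and the second is in milliseconds, so both are set to 5 minutes here.

2. In the application, register a handler through the SetConsoleCtrlHandler function in kernel32.dll to catch the CTRL_SHUTDOWN_EVENT event and release resources

The following .NET Framework Console App illustrates the usage: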

using System;
using System.Runtime.InteropServices;
using System.Threading;

namespace Q1.Foundation.SocketServer
{
    class Program
    {
        internal delegate bool HandlerRoutine(CtrlType ctrlType);

        // Kept in a static field so the delegate is never garbage collected
        // while native code may still call it.
        private static HandlerRoutine handlerRoutine = new HandlerRoutine(ConsoleCtrlHandler);

        // volatile: each flag is written on one thread and polled on another.
        private static volatile bool cancelled = false;
        private static volatile bool cleanupCompleted = false;

        internal enum CtrlType
        {
            CTRL_C_EVENT = 0,
            CTRL_BREAK_EVENT = 1,
            CTRL_CLOSE_EVENT = 2,
            CTRL_LOGOFF_EVENT = 5,
            CTRL_SHUTDOWN_EVENT = 6
        }

        [DllImport("Kernel32")]
        internal static extern bool SetConsoleCtrlHandler(HandlerRoutine handler, bool add);

        static void Main()
        {
            var result = SetConsoleCtrlHandler(handlerRoutine, true);

            // INITIALIZE AND START APP HERE

            // Run until the handler reports a control event.
            while (!cancelled)
            {
                Thread.Sleep(100); // avoid spinning at 100% CPU
            }

            // DO CLEANUP HERE
            // ...
            cleanupCompleted = true;
        }

        private static bool ConsoleCtrlHandler(CtrlType type)
        {
            cancelled = true;

            // Block the handler until Main has finished its cleanup.
            while (!cleanupCompleted)
            {
                Thread.Sleep(100);
            }

            return true;
        }
    }
}

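Code walkthrough:

  • Import Kernel32 and declare the extern function SetConsoleCtrlHandler
  • Create a static HandlerRoutine.
  • Call SetConsoleCtrlHandler to register the handler so the event can be caught
  • Once the event is caught, the application releases its resources while the HandlerRoutine waits
  • When cleanup is complete, return true from the HandlerRoutine to allow the application to exit

These two steps complete the graceful shutdown.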
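The handshake between the handler and Main (the cancelled and cleanupCompleted flags) can also be expressed with wait handles instead of polled flags. The following is a minimal sketch, not the author's implementation: ShutdownCoordinator and its members are hypothetical names, and the handler call is simulated with a plain thread so the sample runs without P/Invoke. In a real application, RequestShutdownAndWait would be called from the HandlerRoutine.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the original post): coordinates the
// console control handler thread with the main thread.
public sealed class ShutdownCoordinator
{
    private readonly ManualResetEventSlim shutdownRequested = new ManualResetEventSlim(false);
    private readonly ManualResetEventSlim cleanupCompleted = new ManualResetEventSlim(false);

    // Called from the handler: wake the main thread, then block until
    // cleanup finishes or the timeout elapses. Returns true if cleanup finished.
    public bool RequestShutdownAndWait(TimeSpan timeout)
    {
        shutdownRequested.Set();
        return cleanupCompleted.Wait(timeout);
    }

    // Called from Main: block until a shutdown has been requested.
    public void WaitForShutdownRequest()
    {
        shutdownRequested.Wait();
    }

    // Called from Main once cleanup is done.
    public void MarkCleanupCompleted()
    {
        cleanupCompleted.Set();
    }
}

public static class Demo
{
    public static void Main()
    {
        var coordinator = new ShutdownCoordinator();
        bool handlerResult = false;

        // Stand-in for Windows invoking the HandlerRoutine on its own thread.
        var handlerThread = new Thread(() =>
        {
            handlerResult = coordinator.RequestShutdownAndWait(TimeSpan.FromSeconds(30));
        });
        handlerThread.Start();

        coordinator.WaitForShutdownRequest();
        Console.WriteLine("Shutdown requested, cleaning up...");
        coordinator.MarkCleanupCompleted();

        handlerThread.Join();
        Console.WriteLine("Handler saw cleanup completed: " + handlerResult);
    }
}
```

The timeout passed to RequestShutdownAndWait is an assumption of this sketch. It would normally be chosen to stay below the registry timeouts configured in step 1, so that the handler returns before Windows stops waiting.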

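Points to note:

1. The traditional event hooks of a .NET Console App (for example, Console.CancelKeyPress and SystemEvents.SessionEnding) do not fire in a container, and AppDomain.CurrentDomain.ProcessExit fires too late. Only SetConsoleCtrlHandler works. For more of the attempted code, see: https://github.com/moby/moby/issues/25982#issuecomment-250490552

2. The HandlerRoutine instance must not be garbage collected before the program exits, which is why the example above uses a static HandlerRoutine. This is important: if the HandlerRoutine is collected while the application is still running, an error is raised. Consider the following code: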

static void Main()
{
    // Initialize here

    ...
    using (...)
    {
        // The delegate is referenced only by this local variable, so the GC
        // is free to collect it once the variable is no longer used.
        var sysEventHandler = new HandlerRoutine(type =>
        {
            cancelled = true;

            while (!cleanupCompleted)
            {
                // Waiting for clean-up to be completed...
            }

            return true;
        });

        var sysEventSetResult = SetConsoleCtrlHandler(sysEventHandler, true);
        ...
    }
    ...

    // Cleanup here
}

Here the HandlerRoutine instance has already been garbage collected before the application exits, so a NullReferenceException is raised when CTRL_SHUTDOWN_EVENT fires. The detailed error message is:

Managed Debugging Assistant 'CallbackOnCollectedDelegate':
A callback was made on a garbage collected delegate of type 'Program+HandlerRoutine::Invoke'. This may cause application crashes, corruption and data loss. When passing delegates to unmanaged code, they must be kept alive by the managed application until it is guaranteed that they will never be called.

A similar case: CallbackOnCollectedDelegate was detected
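When the delegate cannot live in a static field, another way to keep it reachable is to call GC.KeepAlive on it after the last point at which native code may invoke it. The sketch below assumes Windows and the same SetConsoleCtrlHandler import as the example above. KeepAliveDemo, shutdownRequested, and cleanupCompleted are hypothetical names, and the cleanup itself is left as a placeholder.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical sample (not from the original post). Windows only.
static class KeepAliveDemo
{
    private delegate bool HandlerRoutine(int ctrlType);

    [DllImport("Kernel32")]
    private static extern bool SetConsoleCtrlHandler(HandlerRoutine handler, bool add);

    static void Main()
    {
        var shutdownRequested = new ManualResetEventSlim(false);
        var cleanupCompleted = new ManualResetEventSlim(false);

        // Local delegate: only this variable references it.
        HandlerRoutine handler = ctrlType =>
        {
            shutdownRequested.Set();
            cleanupCompleted.Wait(); // hold the handler until Main has cleaned up
            return true;
        };

        SetConsoleCtrlHandler(handler, true);

        // The application runs until a control event arrives.
        shutdownRequested.Wait();

        // Cleanup here
        cleanupCompleted.Set();

        // Unregister the callback, then mark the delegate as in use up to this
        // point so the GC cannot collect it while native code still holds it.
        SetConsoleCtrlHandler(handler, false);
        GC.KeepAlive(handler);
    }
}
```

A static field remains the simpler choice. GC.KeepAlive is mainly useful when the handler has to capture local state, as the lambda does here.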

References for using SetConsoleCtrlHandler:

SetConsoleCtrlHandler function

HandlerRoutine callback function

Finally, if the application is not a Console App but a GUI application, the message to handle is WM_QUERYENDSESSION. See the documentation:

https://docs.microsoft.com/en-us/windows/console/setconsolectrlhandler#remarks

WM_QUERYENDSESSION message
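For a Windows Forms application, one way to observe WM_QUERYENDSESSION is to override WndProc on the main form. This is a minimal sketch under assumptions: MainForm and ReleaseResources are hypothetical names, the cleanup is a placeholder, and the behavior inside a container has not been verified here.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form (not from the original post). Requires a reference
// to the System.Windows.Forms assembly.
public class MainForm : Form
{
    // Message ID of WM_QUERYENDSESSION as defined in WinUser.h.
    private const int WM_QUERYENDSESSION = 0x0011;

    protected override void WndProc(ref Message m)
    {
        if (m.Msg == WM_QUERYENDSESSION)
        {
            // Release resources before the session is allowed to end.
            ReleaseResources();
        }

        // Let the default processing answer the message.
        base.WndProc(ref m);
    }

    private void ReleaseResources()
    {
        // Placeholder for application-specific cleanup.
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new MainForm());
    }
}
```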

Add File Extension to Windows IIS Container during image build

· 2 min read

Suppose we need to allow the json file extension in a containerized IIS.

Dockerfile:

FROM {imageRegistry}/mcr.microsoft.com/dotnet/framework/aspnet:4.8-20200114-windowsservercore-ltsc2019
COPY . /inetpub/wwwroot
WORKDIR /inetpub/wwwroot

RUN C:\windows\system32\inetsrv\appcmd.exe set config "Default Web Site" -section:system.webServer/security/requestFiltering /+"fileExtensions.[fileExtension='json',allowed='True']"

ENV ASPNETCORE_URLS http://+:80
EXPOSE 80/tcp

An error occurs while building the Docker image:

Step 1/6 : FROM repo.q1lan.k8s:9999/mcr.microsoft.com/dotnet/framework/aspnet:4.8-20200114-windowsservercore-ltsc2019
---> a5bc996f06b3
Step 2/6 : COPY . /inetpub/wwwroot
---> bdb9536e506a
Step 3/6 : WORKDIR /inetpub/wwwroot
---> Running in f7666a9ffd0b
Removing intermediate container f7666a9ffd0b
---> c9fe76854f6c
Step 4/6 : RUN C:\windows\system32\inetsrv\appcmd.exe set config "Default Web Site" -section:system.webServer/security/requestFiltering /+"fileExtensions.[fileExtension='json',allowed='True']"
---> Running in 1c74d16420c2
Failed to process input: The parameter 'Web' must begin with a / or - (HRESULT=80070057).

Try escaping all the double quotes in the Dockerfile:

RUN C:\windows\system32\inetsrv\appcmd.exe set config \"Default Web Site\" -section:system.webServer/security/requestFiltering /+\"fileExtensions.[fileExtension='json',allowed='True']\"

It works like a charm:

Step 1/6 : FROM repo.q1lan.k8s:9999/mcr.microsoft.com/dotnet/framework/aspnet:4.8-20200114-windowsservercore-ltsc2019
---> a5bc996f06b3
Step 2/6 : COPY . /inetpub/wwwroot
---> 646bbf3d5def
Step 3/6 : WORKDIR /inetpub/wwwroot
---> Running in 584471c0524a
Removing intermediate container 584471c0524a
---> 54f6a3ade821
Step 4/6 : RUN C:\windows\system32\inetsrv\appcmd.exe set config \"Default Web Site\" -section:system.webServer/security/requestFiltering /+\"fileExtensions.[fileExtension='json',allowed='True']\"
---> Running in f84c38da656a
Applied configuration changes to section "system.webServer/security/requestFiltering" for "MACHINE/WEBROOT/APPHOST/Default Web Site" at configuration commit path "MACHINE/WEBROOT/APPHOST/Default Web Site"
Removing intermediate container f84c38da656a
---> 7dfffe2d9813
Step 5/6 : ENV ASPNETCORE_URLS http://+:80
---> Running in dff81c8282f1
Removing intermediate container dff81c8282f1
---> cbd697556dd7
Step 6/6 : EXPOSE 80/tcp
---> Running in d10903bec188
Removing intermediate container d10903bec188
...