Flink Speculative Execution

Published 2024-07-10 06:00 · flink · big data

1. Configuration

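execution.batch.speculative.enabled: false. Switch for speculative execution. It only takes effect when the AdaptiveBatchScheduler is used.

execution.batch.speculative.max-concurrent-executions: 2. Maximum number of execution attempts that may run at the same time for one subtask, counting the original attempt and the speculative ones.

execution.batch.speculative.block-slow-node-duration: 1 minute. A node on which a slow task was detected is added to the blocklist; this option controls how long it stays there.

slow-task-detector.check-interval: 1 second. Interval between slow task checks.

slow-task-detector.execution-time.baseline-lower-bound: 1 minute. Lower bound of the slow task detection baseline.

slow-task-detector.execution-time.baseline-ratio: 0.75. Finished-task ratio at which slow task detection starts. With 0.75, once 75% of the tasks have finished, the detector starts evaluating whether the remaining tasks are slow.

slow-task-detector.execution-time.baseline-multiplier: 1.5. Multiplier applied to the median execution time of the finished tasks to obtain the slow task baseline.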
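
As a rough sketch, the options above could be combined in flink-conf.yaml as follows. Everything except the first two entries simply repeats the defaults listed above. The `jobmanager.scheduler: AdaptiveBatch` line is an assumption about how the AdaptiveBatchScheduler is selected in the Flink version in use, and the duration spellings (`1 min`, `1 s`) should be checked against that version's documentation.

```yaml
# Sketch only: select the adaptive batch scheduler (assumption, version dependent)
jobmanager.scheduler: AdaptiveBatch
# Turn speculative execution on (default is false)
execution.batch.speculative.enabled: true
# The remaining entries repeat the defaults
execution.batch.speculative.max-concurrent-executions: 2
execution.batch.speculative.block-slow-node-duration: 1 min
slow-task-detector.check-interval: 1 s
slow-task-detector.execution-time.baseline-lower-bound: 1 min
slow-task-detector.execution-time.baseline-ratio: 0.75
slow-task-detector.execution-time.baseline-multiplier: 1.5
```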

2. SpeculativeScheduler

Speculative execution is used together with the AdaptiveBatchScheduler. When AdaptiveBatchSchedulerFactory creates the scheduler and speculative execution is enabled, it creates a SpeculativeScheduler:

if (enableSpeculativeExecution) {
    return new SpeculativeScheduler(
            log,
            jobGraph,
            ioExecutor,
            jobMasterConfiguration,
            // ... (the excerpt ends here; the remaining constructor arguments are not shown)

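2.1 Startup

The scheduler does three things when it starts: (1) it registers metrics; (2) it runs the common startup flow of the parent class, which includes some operator initialization; (3) it starts the slow task detector.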

protected void startSchedulingInternal() {
    registerMetrics(jobManagerJobMetricGroup);

    super.startSchedulingInternal();
    slowTaskDetector.start(getExecutionGraph(), this, getMainThreadExecutor());
}

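2.2 SlowTaskDetector

SlowTaskDetector is responsible for detecting slow tasks. The implementation is ExecutionTimeBasedSlowTaskDetector, which runs the detection as a delayed task on the main thread executor; each run schedules the next one: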

this.scheduledDetectionFuture =
        mainThreadExecutor.schedule(
                () -> {
                    listener.notifySlowTasks(findSlowTasks(executionGraph));
                    scheduleTask(executionGraph, listener, mainThreadExecutor);
                },
                checkIntervalMillis,
                TimeUnit.MILLISECONDS);
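
The pattern in the excerpt above (a one-shot delayed task that schedules its own successor, rather than a fixed-rate timer) can be sketched with plain `java.util.concurrent`. This is an illustration only: `SelfReschedulingSketch`, `maxRounds` and `runRounds` are hypothetical names, and a counter stands in for the real detection call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a self-rescheduling check; not Flink code.
class SelfReschedulingSketch {
    private final ScheduledExecutorService executor;
    private final long checkIntervalMillis;
    private final int maxRounds;
    private final AtomicInteger completedRounds = new AtomicInteger();
    private final CountDownLatch done = new CountDownLatch(1);

    SelfReschedulingSketch(
            ScheduledExecutorService executor, long checkIntervalMillis, int maxRounds) {
        this.executor = executor;
        this.checkIntervalMillis = checkIntervalMillis;
        this.maxRounds = maxRounds;
    }

    // Schedules exactly one delayed run; the run itself schedules the next one.
    void scheduleTask() {
        executor.schedule(
                () -> {
                    // Stand-in for listener.notifySlowTasks(findSlowTasks(...)).
                    int rounds = completedRounds.incrementAndGet();
                    if (rounds < maxRounds) {
                        scheduleTask();
                    } else {
                        done.countDown();
                    }
                },
                checkIntervalMillis,
                TimeUnit.MILLISECONDS);
    }

    // Runs maxRounds (>= 1) checks and returns how many actually ran.
    static int runRounds(int maxRounds, long checkIntervalMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try {
            SelfReschedulingSketch sketch =
                    new SelfReschedulingSketch(executor, checkIntervalMillis, maxRounds);
            sketch.scheduleTask();
            sketch.done.await(5, TimeUnit.SECONDS);
            return sketch.completedRounds.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runRounds(3, 10L));
    }
}
```

Because the next run is only scheduled after the current one has finished, a slow check delays the following check instead of overlapping with it.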

The core is findSlowTasks. It first collects the job vertices that need to be checked:

private List<ExecutionJobVertex> getJobVerticesToCheck(final ExecutionGraph executionGraph) {
    return IterableUtils.toStream(executionGraph.getVerticesTopologically())
            .filter(ExecutionJobVertex::isInitialized)
            .filter(ejv -> ejv.getAggregateState() != ExecutionState.FINISHED)
            .filter(ejv -> getFinishedRatio(ejv) >= baselineRatio)
            .collect(Collectors.toList());
}

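getFinishedRatio returns the ratio of finished subtasks to total subtasks of a job vertex. Only vertices whose ratio has reached the baseline ratio are checked: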

private double getFinishedRatio(final ExecutionJobVertex executionJobVertex) {
    checkState(executionJobVertex.getTaskVertices().length > 0);
    long finishedCount =
            Arrays.stream(executionJobVertex.getTaskVertices())
                    .filter(ev -> ev.getExecutionState() == ExecutionState.FINISHED)
                    .count();
    return (double) finishedCount / executionJobVertex.getTaskVertices().length;
}
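
As a small worked example of the filter, consider a vertex with 8 subtasks and the default baseline ratio of 0.75: with 6 finished subtasks the ratio is exactly 0.75 and the vertex is checked, with 5 it is not. The sketch below uses a hypothetical `State` enum and helper names in place of Flink's `ExecutionJobVertex`, and treats "ratio below 1.0" as a stand-in for the aggregate state not being FINISHED.

```java
import java.util.Arrays;

// Hypothetical stand-in for the vertex filter; not Flink code.
class FinishedRatioSketch {
    enum State { RUNNING, FINISHED }

    // Builds a vertex with `finished` finished subtasks out of `total`.
    static State[] subtasks(int finished, int total) {
        State[] states = new State[total];
        Arrays.fill(states, State.RUNNING);
        Arrays.fill(states, 0, finished, State.FINISHED);
        return states;
    }

    // Finished subtasks divided by all subtasks of the vertex.
    static double finishedRatio(State[] subtasks) {
        if (subtasks.length == 0) {
            throw new IllegalStateException("a vertex needs at least one subtask");
        }
        long finished = Arrays.stream(subtasks).filter(s -> s == State.FINISHED).count();
        return (double) finished / subtasks.length;
    }

    // A vertex is checked once enough subtasks finished, but not when all did.
    static boolean shouldCheck(State[] subtasks, double baselineRatio) {
        double ratio = finishedRatio(subtasks);
        return ratio < 1.0 && ratio >= baselineRatio;
    }

    public static void main(String[] args) {
        System.out.println(shouldCheck(subtasks(6, 8), 0.75));
    }
}
```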

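Next, the baseline is computed and the executions that exceed it are collected, in getBaseline and findExecutionsExceedingBaseline. In essence this compares each execution's running time against the baseline. Note that input bytes are used in addition to time, so the detection appears to be throughput-based (time per input byte) rather than purely time-based: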

private ExecutionTimeWithInputBytes getBaseline(
        final ExecutionJobVertex executionJobVertex, final long currentTimeMillis) {
    final ExecutionTimeWithInputBytes weightedExecutionTimeMedian =
            calculateFinishedTaskExecutionTimeMedian(executionJobVertex, currentTimeMillis);
    long multipliedBaseline =
            (long) (weightedExecutionTimeMedian.getExecutionTime() * baselineMultiplier);

    return new ExecutionTimeWithInputBytes(
            multipliedBaseline, weightedExecutionTimeMedian.getInputBytes());
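}

The comparison between two ExecutionTimeWithInputBytes values divides the execution time by the input bytes on both sides: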
return Double.compare(
        (double) executionTime / Math.max(inputBytes, Double.MIN_VALUE),
        (double) other.getExecutionTime()
                / Math.max(other.getInputBytes(), Double.MIN_VALUE));
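
Putting the two excerpts together, the check can be sketched roughly as follows. This is a simplification under stated assumptions, not the Flink implementation: the median here is taken over plain execution times (Flink's own median is computed over time-and-bytes pairs), and treating the lower bound as a minimum running time a task must reach before it can be reported is an assumption about how `baseline-lower-bound` is applied. All class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical, simplified model of the baseline check; not Flink code.
class SlowTaskBaselineSketch {
    // Median of the finished execution times, scaled by the multiplier.
    // For an even number of samples this picks the upper middle element.
    static long baselineMillis(long[] finishedTimesMillis, double baselineMultiplier) {
        long[] sorted = finishedTimesMillis.clone();
        Arrays.sort(sorted);
        long median = sorted[sorted.length / 2];
        return (long) (median * baselineMultiplier);
    }

    // Compares time per input byte, mirroring the Double.compare excerpt above.
    static int compareTimePerByte(long timeA, long bytesA, long timeB, long bytesB) {
        return Double.compare(
                (double) timeA / Math.max(bytesA, Double.MIN_VALUE),
                (double) timeB / Math.max(bytesB, Double.MIN_VALUE));
    }

    // Assumption: a task is slow if it has run at least lowerBoundMillis
    // and its time per byte is not better than the baseline's.
    static boolean isSlow(
            long runningMillis,
            long runningBytes,
            long baselineMillis,
            long baselineBytes,
            long lowerBoundMillis) {
        return runningMillis >= lowerBoundMillis
                && compareTimePerByte(runningMillis, runningBytes, baselineMillis, baselineBytes)
                        >= 0;
    }

    public static void main(String[] args) {
        long baseline = baselineMillis(new long[] {60_000L, 80_000L, 100_000L}, 1.5);
        System.out.println(isSlow(130_000L, 1_000L, baseline, 1_000L, 60_000L));
    }
}
```

With finished times of 60 s, 80 s and 100 s the median is 80 s and the baseline is 120 s. A task that has run 130 s over the same amount of input exceeds it, while a task that has run 130 s over twice the input does not, because its time per byte is lower.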

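2.3 notifySlowTasks

Once slow tasks have been found, SlowTaskDetector notifies its listener. The listener is implemented by SpeculativeScheduler, in notifySlowTasks.

It first adds the slow nodes to the blocklist: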

// add slow nodes to blocklist before scheduling new speculative executions
blockSlowNodes(slowTasks, currentTimestamp);
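
The idea of a time-limited blocklist can be sketched as a map from node to expiry timestamp. This is only an illustration of the concept behind block-slow-node-duration; `SlowNodeBlocklistSketch` and its methods are hypothetical and do not reflect how Flink's blocklist is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical time-limited blocklist; not Flink code.
class SlowNodeBlocklistSketch {
    private final Map<String, Long> blockedUntilMillis = new HashMap<>();
    private final long blockDurationMillis;

    SlowNodeBlocklistSketch(long blockDurationMillis) {
        this.blockDurationMillis = blockDurationMillis;
    }

    // Blocks the node from nowMillis until nowMillis + blockDurationMillis.
    void blockNode(String nodeId, long nowMillis) {
        blockedUntilMillis.put(nodeId, nowMillis + blockDurationMillis);
    }

    // A node is blocked only while its expiry timestamp lies in the future.
    boolean isBlocked(String nodeId, long nowMillis) {
        Long until = blockedUntilMillis.get(nodeId);
        return until != null && nowMillis < until;
    }

    public static void main(String[] args) {
        SlowNodeBlocklistSketch blocklist = new SlowNodeBlocklistSketch(60_000L);
        blocklist.blockNode("node-1", 0L);
        System.out.println(blocklist.isBlocked("node-1", 30_000L));
    }
}
```

Blocking the node before new attempts are scheduled keeps a speculative attempt from landing on the same slow machine as the original.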

It then checks whether the task supports speculative execution (concurrent execution attempts); by default it does:

if (!executionVertex.isSupportsConcurrentExecutionAttempts()) {
    continue;
}

New Executions are created for the slow task, using the current timestamp:

final Collection<Execution> attempts =
        IntStream.range(0, newSpeculativeExecutionsToDeploy)
                .mapToObj(
                        i ->
                                executionVertex.createNewSpeculativeExecution(
                                        currentTimestamp))
                .collect(Collectors.toList());

A series of setup steps follows, and the vertex and its new attempts are recorded for deployment:

setupSubtaskGatewayForAttempts(executionVertex, attempts);
verticesToDeploy.add(executionVertexId);
newSpeculativeExecutions.addAll(attempts);

Finally, slots are allocated and the new attempts are deployed:

executionDeployer.allocateSlotsAndDeploy(
        newSpeculativeExecutions,
        executionVertexVersioner.getExecutionVertexVersions(verticesToDeploy));
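
The excerpts do not show how `newSpeculativeExecutionsToDeploy` is derived. A plausible reading, stated here as an assumption, is that it is the gap between max-concurrent-executions and the number of attempts the subtask already has. The helper below is hypothetical.

```java
// Hypothetical helper; the formula is an assumption, not taken from the excerpts.
class SpeculativeAttemptCountSketch {
    // New attempts to start so that the subtask reaches, but never exceeds,
    // the configured maximum number of concurrent attempts.
    static int newAttemptsToDeploy(int maxConcurrentExecutions, int currentExecutions) {
        return Math.max(0, maxConcurrentExecutions - currentExecutions);
    }

    public static void main(String[] args) {
        System.out.println(newAttemptsToDeploy(2, 1));
    }
}
```

Under this reading, the default of 2 means a slow subtask with only its original attempt running gets one speculative attempt, and a subtask that already has two attempts gets none.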

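3. Task completion

Job completion ultimately ends in DefaultExecutionGraph.jobFinished. The check starts one level up, in ExecutionJobVertex.executionVertexFinished, which counts finished subtasks against the parallelism: once all subtasks have finished, the job vertex (not yet the whole job) is considered finished: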

void executionVertexFinished() {
    checkState(isInitialized());
    numExecutionVertexFinished++;
    if (numExecutionVertexFinished == parallelismInfo.getParallelism()) {
        getGraph().jobVertexFinished();
    }
}

This call is triggered by Execution, so it happens once for each subtask that finishes:

if (transitionState(current, FINISHED)) {
    try {
        finishPartitionsAndUpdateConsumers();
        updateAccumulatorsAndMetrics(userAccumulators, metrics);
        releaseAssignedResource(null);
        vertex.getExecutionGraphAccessor().deregisterExecution(this);
    } finally {
        vertex.executionFinished(this);
    }
    return;
}

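Finally, when a JobVertex (one task of the job, which has subtasks according to its parallelism) finishes, it notifies the execution graph. Once every job vertex has finished and all executions have terminated, jobFinished is called: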

public void jobVertexFinished() {
    assertRunningInJobMasterMainThread();
    final int numFinished = ++numFinishedJobVertices;
    if (numFinished == numJobVerticesTotal) {
        FutureUtils.assertNoException(
                waitForAllExecutionsTermination().thenAccept(ignored -> jobFinished()));
    }
}
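
The two counters in the excerpts above form a simple two-level roll-up: subtasks report to their vertex, and vertices report to the graph. A minimal sketch of that roll-up, with hypothetical `Job` and `Vertex` classes and without the wait for all executions to terminate, could look like this:

```java
// Hypothetical model of the two-level completion counting; not Flink code.
class CompletionCountingSketch {
    static final class Job {
        private final int totalVertices;
        private int finishedVertices;
        private boolean finished;

        Job(int totalVertices) {
            this.totalVertices = totalVertices;
        }

        // Called once per finished vertex; the job finishes with the last one.
        void jobVertexFinished() {
            if (++finishedVertices == totalVertices) {
                finished = true;
            }
        }

        boolean isFinished() {
            return finished;
        }
    }

    static final class Vertex {
        private final Job job;
        private final int parallelism;
        private int finishedSubtasks;

        Vertex(Job job, int parallelism) {
            this.job = job;
            this.parallelism = parallelism;
        }

        // Called once per finished subtask; the vertex finishes with the last one.
        void executionVertexFinished() {
            if (++finishedSubtasks == parallelism) {
                job.jobVertexFinished();
            }
        }
    }

    public static void main(String[] args) {
        Job job = new Job(1);
        Vertex vertex = new Vertex(job, 2);
        vertex.executionVertexFinished();
        vertex.executionVertexFinished();
        System.out.println(job.isFinished());
    }
}
```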

Original article: https://blog.csdn.net/blackjjcat/article/details/138605327
