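Resource propagation failure: ClusterResourcePlacementWorkSynchronized is false

This article describes how to troubleshoot ClusterResourcePlacementWorkSynchronized issues that occur when you propagate resources by using the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager.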

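Symptoms

When you use the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager to propagate resources and then update the ClusterResourcePlacement resource, the associated work objects aren't synchronized with the changes, and the ClusterResourcePlacementWorkSynchronized condition status is shown as False.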

Note

To get more information about why the work object synchronization fails, check the work generator controller logs.

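Cause

This issue might occur for one of the following reasons:

  • The controller encounters an error while trying to generate the corresponding work object.
  • The enveloped object isn't well formatted.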

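Case study

In the following example, the ClusterResourcePlacement tries to propagate resources to a selected cluster, but the work object isn't updated to reflect the latest changes because the selected cluster has been terminated.

ClusterResourcePlacement spec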

spec:
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-ns
      version: v1
  policy:
    placementType: PickN
    numberOfClusters: 1
  strategy:
    type: RollingUpdate

ClusterResourcePlacement status

spec:
  policy:
    numberOfClusters: 1
    placementType: PickN
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: test-ns
    version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
status:
  conditions:
  - lastTransitionTime: "2024-05-14T18:05:04Z"
    message: found all cluster needed as specified by the scheduling policy, found
      1 cluster(s)
    observedGeneration: 1
    reason: SchedulingPolicyFulfilled
    status: "True"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-14T18:05:05Z"
    message: All 1 cluster(s) start rolling out the latest resource
    observedGeneration: 1
    reason: RolloutStarted
    status: "True"
    type: ClusterResourcePlacementRolloutStarted
  - lastTransitionTime: "2024-05-14T18:05:05Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 1
    reason: NoOverrideSpecified
    status: "True"
    type: ClusterResourcePlacementOverridden
  - lastTransitionTime: "2024-05-14T18:05:05Z"
    message: There are 1 cluster(s) which have not finished creating or updating work(s)
      yet
    observedGeneration: 1
    reason: WorkNotSynchronizedYet
    status: "False"
    type: ClusterResourcePlacementWorkSynchronized
  observedResourceIndex: "0"
  placementStatuses:
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-14T18:05:04Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-14T18:05:05Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-14T18:05:05Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-14T18:05:05Z"
      message: 'Failed to synchronize the work to the latest: works.placement.kubernetes-fleet.io
        "crp1-work" is forbidden: unable to create new content in namespace fleet-member-kind-cluster-1
        because it is being terminated'
      observedGeneration: 1
      reason: SyncWorkFailed
      status: "False"
      type: WorkSynchronized
  selectedResources:
  - kind: Namespace
    name: test-ns
    version: v1

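In the ClusterResourcePlacement status, the ClusterResourcePlacementWorkSynchronized condition status is shown as False. The per-cluster WorkSynchronized message indicates that the work object crp1-work is forbidden from creating new content in the namespace fleet-member-kind-cluster-1 because that namespace is currently being terminated.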
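When a placement targets many clusters, reading the per-cluster conditions by eye gets tedious. As a minimal sketch, assuming the placement has been exported as JSON (for example with `kubectl get clusterresourceplacement <name> -o json`) and loaded into a Python dict, the hypothetical helper `find_work_sync_failures` below lists every cluster whose WorkSynchronized condition is False. The helper name and the trimmed `sample_crp` data are illustrative only and are not part of Fleet.

```python
def find_work_sync_failures(crp: dict) -> list:
    """Return one entry per cluster whose WorkSynchronized condition is False.

    `crp` is assumed to be a ClusterResourcePlacement object loaded as a dict.
    """
    failures = []
    for placement in crp.get("status", {}).get("placementStatuses", []):
        for cond in placement.get("conditions", []):
            if cond.get("type") == "WorkSynchronized" and cond.get("status") == "False":
                failures.append({
                    "cluster": placement.get("clusterName"),
                    "reason": cond.get("reason"),
                    "message": cond.get("message"),
                })
    return failures


# Trimmed copy of the status shown in this article (illustrative input).
sample_crp = {
    "status": {
        "placementStatuses": [
            {
                "clusterName": "kind-cluster-1",
                "conditions": [
                    {"type": "Scheduled", "status": "True", "reason": "Scheduled"},
                    {
                        "type": "WorkSynchronized",
                        "status": "False",
                        "reason": "SyncWorkFailed",
                        "message": (
                            "Failed to synchronize the work to the latest: "
                            'works.placement.kubernetes-fleet.io "crp1-work" is forbidden: '
                            "unable to create new content in namespace "
                            "fleet-member-kind-cluster-1 because it is being terminated"
                        ),
                    },
                ],
            }
        ]
    }
}

for failure in find_work_sync_failures(sample_crp):
    print(f"{failure['cluster']}: {failure['reason']}: {failure['message']}")
```

A message that mentions a namespace "being terminated", as in this case, points at the member cluster leaving the fleet rather than at a problem with the selected resources.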

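Resolution

In this situation, here are a few potential solutions:

  • Modify the ClusterResourcePlacement so that it selects a different cluster.
  • Delete the ClusterResourcePlacement so that the work is removed through garbage collection.
  • Rejoin the member cluster. The namespace can be regenerated only after the cluster rejoins.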

In other situations, you might choose to wait for the work to finish propagating.
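If you decide to wait, a small polling loop can tell you when the condition flips to True. The following is only a sketch under stated assumptions: `fetch_crp` is a hypothetical callable that you supply and that returns the current ClusterResourcePlacement as a dict, and the timeout and interval defaults are arbitrary. The condition type name comes from the status shown in this article. Comparing `observedGeneration` with `metadata.generation` is an extra precaution added here so that a stale condition isn't mistaken for success.

```python
import time


def is_work_synchronized(crp: dict) -> bool:
    """True if the top-level work-synchronized condition is True and current."""
    generation = crp.get("metadata", {}).get("generation")
    for cond in crp.get("status", {}).get("conditions", []):
        if cond.get("type") != "ClusterResourcePlacementWorkSynchronized":
            continue
        # Treat the condition as fresh if no generation is available to compare.
        fresh = generation is None or cond.get("observedGeneration") == generation
        return cond.get("status") == "True" and fresh
    return False


def wait_for_work_synchronized(fetch_crp, timeout_s=300.0, interval_s=10.0,
                               sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll fetch_crp() until the condition is True or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if is_work_synchronized(fetch_crp()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Demo with canned responses instead of a real cluster (illustrative only).
    def make(status):
        return {
            "metadata": {"generation": 1},
            "status": {"conditions": [{
                "type": "ClusterResourcePlacementWorkSynchronized",
                "status": status,
                "observedGeneration": 1,
            }]},
        }

    states = iter([make("False"), make("True")])
    done = wait_for_work_synchronized(lambda: next(states), interval_s=0)
    print("synchronized" if done else "timed out")
```

In practice, `fetch_crp` could wrap whatever you already use to read the object, such as a `kubectl get ... -o json` call. In the terminated-cluster case above, waiting alone won't help, so a timeout is worth keeping.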

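Contact us for help

If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to the Azure feedback community.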