Hi,
qpu_metric is probably the most important metric to monitor if you want to see whether you are approaching the limits of your SKU. You can create a metric alert on it and set the threshold slightly below the maximum QPU for your SKU.
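As a rough illustration of picking that threshold, here is a minimal Python sketch. It only does the arithmetic locally and does not call any Azure API. The function names (alert_threshold, breaches), the 10% headroom, and the 100 QPU limit are assumptions for the example, so check the actual QPU limit of your tier before using the numbers.

```python
def alert_threshold(max_qpu: int, headroom_pct: int = 10) -> int:
    """Return a QPU alert threshold that sits headroom_pct percent below max_qpu."""
    if max_qpu <= 0:
        raise ValueError("max_qpu must be positive")
    if not 0 < headroom_pct < 100:
        raise ValueError("headroom_pct must be between 1 and 99")
    # Integer arithmetic avoids floating-point rounding surprises.
    return max_qpu * (100 - headroom_pct) // 100


def breaches(samples: list[float], threshold: float) -> bool:
    """Return True if the average of the sampled qpu_metric values exceeds threshold."""
    if not samples:
        return False
    return sum(samples) / len(samples) > threshold


# Assumed example: a SKU capped at 100 QPU (verify the limit for your own tier).
threshold = alert_threshold(100, headroom_pct=10)
print(threshold)
# Hypothetical qpu_metric samples taken over one evaluation window.
print(breaches([85, 92, 97], threshold))
```

The value returned by alert_threshold is what you would enter as the threshold of the metric alert. Averaging over a window, as breaches does, mirrors the idea of alerting on sustained load rather than on a single spike.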
Please "Accept the answer" if the information helped you. This will help us and others in the community as well.