PCI: dwc/qcom: Rework dw_pcie_wait_for_link() error handling and add D3cold support#644
PCI: dwc/qcom: Rework dw_pcie_wait_for_link() error handling and add D3cold support#644ziyuezhang-123 wants to merge 10 commits into
Conversation
…vice is not found The dw_pcie_wait_for_link() function waits up to 1 second for the PCIe link to come up and returns -ETIMEDOUT for all failures without distinguishing cases where no device is present on the bus. But the callers may want to just skip the failure if the device is not found on the bus and handle failure for other reasons. So after timeout, if the LTSSM is in Detect.Quiet or Detect.Active state, return -ENODEV to indicate the callers that the device is not found on the bus and return -ETIMEDOUT otherwise. Also add kernel doc to document the parameter and return values. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Tested-by: Richard Zhu <hongxing.zhu@nxp.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-1-2f32d5082549@oss.qualcomm.com Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
…e is not active
There are cases where the PCIe device would be physically connected to the
bus, but the device firmware might not be active. So the LTSSM will
get stuck in POLL.{Active/Compliance} states.
This behavior is common with endpoint devices controlled by the PCI
Endpoint framework, where the device will wait for the user to start its
operation through configfs.
For those cases, print the relevant log and return -EIO to indicate that
the device is present, but not active. This will allow the callers to skip
the failure as the device might become active in the future.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-2-2f32d5082549@oss.qualcomm.com
Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
…ignware.c Rename ltssm_status_string() to dw_pcie_ltssm_status_string() and move it to the common file pcie-designware.c so that this function could be used outside of pcie-designware-debugfs.c file. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Tested-by: Richard Zhu <hongxing.zhu@nxp.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-3-2f32d5082549@oss.qualcomm.com Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
For the cases where the link cannot come up later i.e., when LTSSM is not
in Detect.{Quiet/Active} or Poll.{Active/Compliance} states,
dw_pcie_wait_for_link() should log an error.
So promote dev_info() to dev_err(), reword the error log to make it clear
and also print the LTSSM state to aid debugging.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Tested-by: Richard Zhu <hongxing.zhu@nxp.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-4-2f32d5082549@oss.qualcomm.com
Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
…() returns -ETIMEDOUT The dw_pcie_wait_for_link() API now distinguishes link failures more precisely: -ENODEV: Device not found on the bus. -EIO: Device found but inactive. -ETIMEDOUT: Link failed to come up. Out of these three errors, only -ETIMEDOUT represents a definitive link failure since it signals that something is wrong with the link. For the other two errors, there is a possibility that the link might come up later. So fail dw_pcie_host_init() if -ETIMEDOUT is returned and skip the failure otherwise. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-5-2f32d5082549@oss.qualcomm.com Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
…d eligibility Add a common helper, pci_host_common_d3cold_possible(), to determine whether PCIe devices under host bridge can safely transition to D3cold. This helper is intended to be used by PCI host controller drivers to decide whether they may safely put the host bridge into D3cold based on the power state and wakeup capabilities of downstream endpoints. The helper walks all devices on the all bridge buses and only allows the devices to enter D3cold if all PCIe endpoints are already in PCI_D3hot. This ensures that we do not power off the host bridge while any active endpoint still requires the link to remain powered. For devices that may wake the system, the helper additionally requires that the device supports PME wake from D3cold (via WAKE#). Devices that do not have wakeup enabled are not restricted by this check and do not block the devices under host bridge from entering D3cold. Devices without a bound driver and with PCI not enabled via sysfs are treated as inactive and therefore do not prevent the devices under host bridge from entering D3cold. This allows controllers to power down more aggressively when there are no actively managed endpoints. Some devices (e.g. M.2 without auxiliary power) lose PME detection when main power is removed. Even if such devices advertise PME-from-D3cold capability, entering D3cold may break wakeup. So, return PME-from-D3cold capability via an output parameter so PCIe controller drivers can apply platform-specific handling to preserve wakeup functionality. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Link: https://lore.kernel.org/all/20260429-d3cold-v5-1-89e9735b9df6@oss.qualcomm.com/ Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
For older targets like sc7280, we see reading DBI after sending PME turn off message is causing NOC error. To avoid unsafe DBI accesses, introduce qcom_pcie_get_ltssm() to retrieve the LTSSM state. For newer platforms, the LTSSM state is read from the PARF_LTSSM register, while older platforms continue to retrieve it from ELBI_SYS_STTS. This helper is used in place of direct DBI-based link state checks in the D3cold path after sending PME turn-off message, ensuring the LTSSM state can be queried safely even after DBI access is no longer valid. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Link: https://lore.kernel.org/all/20260429-d3cold-v5-2-89e9735b9df6@oss.qualcomm.com/ Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
…g rails/clocks Some Qcom PCIe controller variants bring the PHY out of test power-down (PHY_TEST_PWR_DOWN) during init. When the link is later transitioned towards D3cold and the driver disables PCIe clocks and/or regulators without explicitly re-asserting PHY_TEST_PWR_DOWN, the PHY can remain partially powered, leading to avoidable power leakage. Update the init-path comments to reflect that PARF_PHY_CTRL is used to power the PHY on. Also, for controller revisions that enable PHY power in init (2.3.2, 2.3.3, 2.4.0, 2.7.0 and 2.9.0), explicitly power the PHY down via PARF_PHY_CTRL in the deinit path before disabling clocks or regulators. This ensures the PHY is put into a defined low-power state prior to removing its supplies, preventing leakage when entering D3cold. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Link: https://lore.kernel.org/all/20260429-d3cold-v5-3-89e9735b9df6@oss.qualcomm.com/ Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
Previously, the driver skipped putting the link into L2/device state in D3cold whenever L1 ASPM was enabled, since some devices (e.g. NVMe) expect low resume latency and may not tolerate deeper power states. However, such devices typically remain in D0 and are already covered by the new helper's requirement that all endpoints be in D3hot before the devices under host bridge may enter D3cold. So, replace the local L1/L1SS-based check in dw_pcie_suspend_noirq() with the shared pci_host_common_d3cold_possible() helper to decide whether the devices under host bridge can safely transition to D3cold. In addition, propagate PME-from-D3cold capability information from the helper and record it in skip_pwrctrl_off. Some devices (e.g. M.2 cards without auxiliary power) may lose PME detection when main power is removed, even if they advertise PME-from-D3cold support. This allows controller power-off to be skipped when required to preserve wakeup functionality. Update the suspended flag in dw_pcie_resume_noirq() only after the PCIe link resumes successfully, to avoid marking the controller active when link resume fails. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Link: https://lore.kernel.org/all/20260429-d3cold-v5-4-89e9735b9df6@oss.qualcomm.com/ Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
|
Merge Check Failed: No CR Numbers Found Error: No Change Request numbers were found. Please add Change Request numbers to your pull request description in the format CRs-Fixed: 12345 or link GitHub issues that are associated with Change Requests. |
|
Merge Check Failed: CR Not Eligible for Merge CR 4551808 is not eligible for merge. The parent software image for kernel.qli.2.0 is not development complete. Entity: Please ensure the CR passes both CCT (ComponentChangeTasks) and ICT (Integration Change Tasks) validations. |
dev complete now |
Add support for transitioning PCIe endpoints under host bridge into D3cold by integrating with the DWC core suspend/resume helpers. Implement PME_TurnOff message generation via ELBI_SYS_CTRL and hook it into the DWC host operations so the controller follows the standard PME_TurnOff-based power-down sequence before entering D3cold. When the device is suspended into D3cold, fully tear down interconnect bandwidth, OPP votes. If D3cold is not entered, retain existing behavior by keeping the required interconnect and OPP votes. Use dw_pcie::skip_pwrctrl_off to avoid powering off devices during suspend to preserve wakeup capability of the devices and also not to power on the devices in the init path. Drop the qcom_pcie::suspended flag and rely on the existing dw_pcie::suspended state, which now drives both the power-management flow and the interconnect/OPP handling. Link: https://lore.kernel.org/all/20260429-d3cold-v5-5-89e9735b9df6@oss.qualcomm.com/ Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
|
Merge Check Failed: CR Not Eligible for Merge CR 4551808 is not eligible for merge. The parent software image for kernel.qli.2.0 is not development complete. Entity: Please ensure the CR passes both CCT (ComponentChangeTasks) and ICT (Integration Change Tasks) validations. |
PR #644 — validate-patchPR: #644
Final Summary
|
PR #644 — checker-log-analyzerPR: #644
Detailed report: Full report
|
Port two upstream patch series to d3_cold_2:
Rework dw_pcie_wait_for_link() error handling (v4,
https://lore.kernel.org/all/20260120-pci-dwc-suspend-rework-v4-0-2f32d5082549@oss.qualcomm.com/):
distinguish -ENODEV/-EIO/-ETIMEDOUT for different link failure scenarios,
improve error logging with LTSSM state, and only fail dw_pcie_host_init()
on -ETIMEDOUT.
Add Qualcomm PCIe D3cold support (v5,
https://lore.kernel.org/all/20260429-d3cold-v5-0-89e9735b9df6@oss.qualcomm.com/):
add pci_host_common_d3cold_possible() eligibility helper, safe LTSSM
read via PARF register, PHY power-down in deinit path, and full
interconnect/OPP teardown on D3cold entry.
Two fixes applied during porting: correct ops->get_ltssm guard condition
in qcom_pcie_get_ltssm() (upstream rebase bug), and remove unused
err_deinit label in dw_pcie_resume_noirq() (fixes -Wunused-label warning).
CRs-Fixed: 4551808