243 lines
11 KiB
Text
243 lines
11 KiB
Text
|
PM Quality Of Service Interface.
|
||
|
|
||
|
This interface provides a kernel and user mode interface for registering
|
||
|
performance expectations by drivers, subsystems and user space applications on
|
||
|
one of the parameters.
|
||
|
|
||
|
Two different PM QoS frameworks are available:
|
||
|
1. PM QoS classes for cpu_dma_latency, network_latency, network_throughput,
|
||
|
memory_bandwidth.
|
||
|
2. the per-device PM QoS framework provides the API to manage the per-device latency
|
||
|
constraints and PM QoS flags.
|
||
|
|
||
|
Each parameters have defined units:
|
||
|
* latency: usec
|
||
|
* timeout: usec
|
||
|
* throughput: kbs (kilo bit / sec)
|
||
|
* memory bandwidth: mbs (mega bit / sec)
|
||
|
|
||
|
|
||
|
1. PM QoS framework
|
||
|
|
||
|
The infrastructure exposes multiple misc device nodes one per implemented
|
||
|
parameter. The set of parameters implement is defined by pm_qos_power_init()
|
||
|
and pm_qos_params.h. This is done because having the available parameters
|
||
|
being runtime configurable or changeable from a driver was seen as too easy to
|
||
|
abuse.
|
||
|
|
||
|
For each parameter a list of performance requests is maintained along with
|
||
|
an aggregated target value. The aggregated target value is updated with
|
||
|
changes to the request list or elements of the list. Typically the
|
||
|
aggregated target value is simply the max or min of the request values held
|
||
|
in the parameter list elements.
|
||
|
Note: the aggregated target value is implemented as an atomic variable so that
|
||
|
reading the aggregated value does not require any locking mechanism.
|
||
|
|
||
|
|
||
|
From kernel mode the use of this interface is simple:
|
||
|
|
||
|
void pm_qos_add_request(handle, param_class, target_value):
|
||
|
Will insert an element into the list for that identified PM QoS class with the
|
||
|
target value. Upon change to this list the new target is recomputed and any
|
||
|
registered notifiers are called only if the target value is now different.
|
||
|
Clients of pm_qos need to save the returned handle for future use in other
|
||
|
pm_qos API functions.
|
||
|
|
||
|
The handle is a pm_qos_request object. By default the request object sets the
|
||
|
request type to PM_QOS_REQ_ALL_CORES, in which case, the PM QoS request
|
||
|
applies to all cores. However, the driver can also specify a request type to
|
||
|
be either of
|
||
|
PM_QOS_REQ_ALL_CORES,
|
||
|
PM_QOS_REQ_AFFINE_CORES,
|
||
|
PM_QOS_REQ_AFFINE_IRQ,
|
||
|
|
||
|
Specify the cpumask when type is set to PM_QOS_REQ_AFFINE_CORES and specify
|
||
|
the IRQ number with PM_QOS_REQ_AFFINE_IRQ.
|
||
|
|
||
|
void pm_qos_update_request(handle, new_target_value):
|
||
|
Will update the list element pointed to by the handle with the new target value
|
||
|
and recompute the new aggregated target, calling the notification tree if the
|
||
|
target is changed.
|
||
|
|
||
|
void pm_qos_remove_request(handle):
|
||
|
Will remove the element. After removal it will update the aggregate target and
|
||
|
call the notification tree if the target was changed as a result of removing
|
||
|
the request.
|
||
|
|
||
|
int pm_qos_request(param_class):
|
||
|
Returns the aggregated value for a given PM QoS class.
|
||
|
|
||
|
int pm_qos_request_for_cpu(param_class, cpu):
|
||
|
Returns the aggregated value for a given PM QoS class for the specified cpu.
|
||
|
|
||
|
int pm_qos_request_for_cpumask(param_class, cpumask):
|
||
|
Returns the aggregated value for a given PM QoS class for the specified
|
||
|
cpumask.
|
||
|
|
||
|
int pm_qos_request_active(handle):
|
||
|
Returns if the request is still active, i.e. it has not been removed from a
|
||
|
PM QoS class constraints list.
|
||
|
|
||
|
int pm_qos_add_notifier(param_class, notifier):
|
||
|
Adds a notification callback function to the PM QoS class. The callback is
|
||
|
called when the aggregated value for the PM QoS class is changed.
|
||
|
|
||
|
int pm_qos_remove_notifier(int param_class, notifier):
|
||
|
Removes the notification callback function for the PM QoS class.
|
||
|
|
||
|
|
||
|
From user mode:
|
||
|
Only processes can register a pm_qos request. To provide for automatic
|
||
|
cleanup of a process, the interface requires the process to register its
|
||
|
parameter requests in the following way:
|
||
|
|
||
|
To register the default pm_qos target for the specific parameter, the process
|
||
|
must open one of /dev/[cpu_dma_latency, network_latency, network_throughput]
|
||
|
|
||
|
As long as the device node is held open that process has a registered
|
||
|
request on the parameter.
|
||
|
|
||
|
To change the requested target value the process needs to write an s32 value to
|
||
|
the open device node. Alternatively the user mode program could write a hex
|
||
|
string for the value using 10 char long format e.g. "0x12345678". This
|
||
|
translates to a pm_qos_update_request call.
|
||
|
|
||
|
To remove the user mode request for a target value simply close the device
|
||
|
node.
|
||
|
|
||
|
|
||
|
2. PM QoS per-device latency and flags framework
|
||
|
|
||
|
For each device, there are three lists of PM QoS requests. Two of them are
|
||
|
maintained along with the aggregated targets of resume latency and active
|
||
|
state latency tolerance (in microseconds) and the third one is for PM QoS flags.
|
||
|
Values are updated in response to changes of the request list.
|
||
|
|
||
|
The target values of resume latency and active state latency tolerance are
|
||
|
simply the minimum of the request values held in the parameter list elements.
|
||
|
The PM QoS flags aggregate value is a gather (bitwise OR) of all list elements'
|
||
|
values. Two device PM QoS flags are defined currently: PM_QOS_FLAG_NO_POWER_OFF
|
||
|
and PM_QOS_FLAG_REMOTE_WAKEUP.
|
||
|
|
||
|
Note: The aggregated target values are implemented in such a way that reading
|
||
|
the aggregated value does not require any locking mechanism.
|
||
|
|
||
|
|
||
|
From kernel mode the use of this interface is the following:
|
||
|
|
||
|
int dev_pm_qos_add_request(device, handle, type, value):
|
||
|
Will insert an element into the list for that identified device with the
|
||
|
target value. Upon change to this list the new target is recomputed and any
|
||
|
registered notifiers are called only if the target value is now different.
|
||
|
Clients of dev_pm_qos need to save the handle for future use in other
|
||
|
dev_pm_qos API functions.
|
||
|
|
||
|
int dev_pm_qos_update_request(handle, new_value):
|
||
|
Will update the list element pointed to by the handle with the new target value
|
||
|
and recompute the new aggregated target, calling the notification trees if the
|
||
|
target is changed.
|
||
|
|
||
|
int dev_pm_qos_remove_request(handle):
|
||
|
Will remove the element. After removal it will update the aggregate target and
|
||
|
call the notification trees if the target was changed as a result of removing
|
||
|
the request.
|
||
|
|
||
|
s32 dev_pm_qos_read_value(device):
|
||
|
Returns the aggregated value for a given device's constraints list.
|
||
|
|
||
|
enum pm_qos_flags_status dev_pm_qos_flags(device, mask)
|
||
|
Check PM QoS flags of the given device against the given mask of flags.
|
||
|
The meaning of the return values is as follows:
|
||
|
PM_QOS_FLAGS_ALL: All flags from the mask are set
|
||
|
PM_QOS_FLAGS_SOME: Some flags from the mask are set
|
||
|
PM_QOS_FLAGS_NONE: No flags from the mask are set
|
||
|
PM_QOS_FLAGS_UNDEFINED: The device's PM QoS structure has not been
|
||
|
initialized or the list of requests is empty.
|
||
|
|
||
|
int dev_pm_qos_add_ancestor_request(dev, handle, type, value)
|
||
|
Add a PM QoS request for the first direct ancestor of the given device whose
|
||
|
power.ignore_children flag is unset (for DEV_PM_QOS_RESUME_LATENCY requests)
|
||
|
or whose power.set_latency_tolerance callback pointer is not NULL (for
|
||
|
DEV_PM_QOS_LATENCY_TOLERANCE requests).
|
||
|
|
||
|
int dev_pm_qos_expose_latency_limit(device, value)
|
||
|
Add a request to the device's PM QoS list of resume latency constraints and
|
||
|
create a sysfs attribute pm_qos_resume_latency_us under the device's power
|
||
|
directory allowing user space to manipulate that request.
|
||
|
|
||
|
void dev_pm_qos_hide_latency_limit(device)
|
||
|
Drop the request added by dev_pm_qos_expose_latency_limit() from the device's
|
||
|
PM QoS list of resume latency constraints and remove sysfs attribute
|
||
|
pm_qos_resume_latency_us from the device's power directory.
|
||
|
|
||
|
int dev_pm_qos_expose_flags(device, value)
|
||
|
Add a request to the device's PM QoS list of flags and create sysfs attributes
|
||
|
pm_qos_no_power_off and pm_qos_remote_wakeup under the device's power directory
|
||
|
allowing user space to change these flags' value.
|
||
|
|
||
|
void dev_pm_qos_hide_flags(device)
|
||
|
Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS list
|
||
|
of flags and remove sysfs attributes pm_qos_no_power_off and pm_qos_remote_wakeup
|
||
|
under the device's power directory.
|
||
|
|
||
|
Notification mechanisms:
|
||
|
The per-device PM QoS framework has 2 different and distinct notification trees:
|
||
|
a per-device notification tree and a global notification tree.
|
||
|
|
||
|
int dev_pm_qos_add_notifier(device, notifier):
|
||
|
Adds a notification callback function for the device.
|
||
|
The callback is called when the aggregated value of the device constraints list
|
||
|
is changed (for resume latency device PM QoS only).
|
||
|
|
||
|
int dev_pm_qos_remove_notifier(device, notifier):
|
||
|
Removes the notification callback function for the device.
|
||
|
|
||
|
int dev_pm_qos_add_global_notifier(notifier):
|
||
|
Adds a notification callback function in the global notification tree of the
|
||
|
framework.
|
||
|
The callback is called when the aggregated value for any device is changed
|
||
|
(for resume latency device PM QoS only).
|
||
|
|
||
|
int dev_pm_qos_remove_global_notifier(notifier):
|
||
|
Removes the notification callback function from the global notification tree
|
||
|
of the framework.
|
||
|
|
||
|
|
||
|
Active state latency tolerance
|
||
|
|
||
|
This device PM QoS type is used to support systems in which hardware may switch
|
||
|
to energy-saving operation modes on the fly. In those systems, if the operation
|
||
|
mode chosen by the hardware attempts to save energy in an overly aggressive way,
|
||
|
it may cause excess latencies to be visible to software, causing it to miss
|
||
|
certain protocol requirements or target frame or sample rates etc.
|
||
|
|
||
|
If there is a latency tolerance control mechanism for a given device available
|
||
|
to software, the .set_latency_tolerance callback in that device's dev_pm_info
|
||
|
structure should be populated. The routine pointed to by it is should implement
|
||
|
whatever is necessary to transfer the effective requirement value to the
|
||
|
hardware.
|
||
|
|
||
|
Whenever the effective latency tolerance changes for the device, its
|
||
|
.set_latency_tolerance() callback will be executed and the effective value will
|
||
|
be passed to it. If that value is negative, which means that the list of
|
||
|
latency tolerance requirements for the device is empty, the callback is expected
|
||
|
to switch the underlying hardware latency tolerance control mechanism to an
|
||
|
autonomous mode if available. If that value is PM_QOS_LATENCY_ANY, in turn, and
|
||
|
the hardware supports a special "no requirement" setting, the callback is
|
||
|
expected to use it. That allows software to prevent the hardware from
|
||
|
automatically updating the device's latency tolerance in response to its power
|
||
|
state changes (e.g. during transitions from D3cold to D0), which generally may
|
||
|
be done in the autonomous latency tolerance control mode.
|
||
|
|
||
|
If .set_latency_tolerance() is present for the device, sysfs attribute
|
||
|
pm_qos_latency_tolerance_us will be present in the devivce's power directory.
|
||
|
Then, user space can use that attribute to specify its latency tolerance
|
||
|
requirement for the device, if any. Writing "any" to it means "no requirement,
|
||
|
but do not let the hardware control latency tolerance" and writing "auto" to it
|
||
|
allows the hardware to be switched to the autonomous mode if there are no other
|
||
|
requirements from the kernel side in the device's list.
|
||
|
|
||
|
Kernel code can use the functions described above along with the
|
||
|
DEV_PM_QOS_LATENCY_TOLERANCE device PM QoS type to add, remove and update
|
||
|
latency tolerance requirements for devices.
|