Commit graph

288555 commits

Author SHA1 Message Date
Namhyung Kim
fde0eeaba7 perf report: Document --symbol-filter option
Add missing description of --symbol-filter in Documentation/perf-report.txt.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1332125628-23088-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-19 12:13:56 -03:00
Namhyung Kim
f7e5410920 perf ui browser: Clean lines inside of the input window
As Arnaldo pointed out, it should be cleared to prevent the window from
displaying overlapped strings on the region.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1332125180-23041-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-19 12:07:30 -03:00
Namhyung Kim
6db6127c4d perf report: Treat an argument as a symbol filter
As Ingo requested, it'd be better off treating first (and the only)
argument as a symbol filter, so that user doesn't need to input the
symbol on the dialog window on TUI.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-5-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:44:36 -03:00
Namhyung Kim
b14ffaca44 perf report: Add --symbol-filter option
Add new --symbol-filter command line option to set appropriate filter
string.

Its short version is missing as I couldn't find an ideal one and
--filter option of perf record also has no short version.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-4-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:41:30 -03:00
Namhyung Kim
938a23ae7f perf ui browser: Add 's' key to filter by symbol name
Now user can enter symbol name interested via ui_browser__input_window,
and perf can process it using hists__filter_by_symbol(). Giving empty
symbol (by pressing 's' followed by ENTER) will disable the filtering.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:34:13 -03:00
Namhyung Kim
aa49f6ec99 perf ui browser: Introduce ui_browser__input_window
The ui_browser__input_window() function is to get user's key input.
Current implementation can handle maximum 49 characters.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:32:36 -03:00
Namhyung Kim
e94d53ebec perf hists: Add hists__filter_by_symbol
This function will be used for simple (sub-)string matching filter based
on user input.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:31:09 -03:00
Namhyung Kim
5090c6aea8 perf tools: Do not disable members of group event
When event group is enabled for forked task (i.e. no target task/cpu
was specified) all events were disabled and marked ->enable_on_exec.
However they wouldn't be counted at all since only group leader will
be enabled on exec actually.

In contrast to perf stat, perf record doesn't have a real problem
as it enables all the event before proceeding. But it needs to be
fixed anyway IMHO.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887340-32448-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:29:33 -03:00
Namhyung Kim
4c19ea453d perf stat: Fix event grouping on forked task
When event group is enabled for forked task (i.e. no target task was
specified) all events were disabled and marked ->enable_on_exec.
However they are not counted at all since only group leader will be
enabled on exec actually. So the result looked like below:

 $ ./perf stat --group -- sleep 1

 Performance counter stats for 'sleep 1':

          0.554926 task-clock                #    0.001 CPUs utilized
     <not counted> context-switches
     <not counted> CPU-migrations
     <not counted> page-faults
     <not counted> cycles
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     <not counted> instructions
     <not counted> branches
     <not counted> branch-misses

       1.001228093 seconds time elapsed

Fix it by disabling group leader only.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887340-32448-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:13:45 -03:00
Jiri Olsa
5f537a2659 perf tools: Add support to specify pmu style event
Added new event rule to the event definition grammar:

event_def: event_pmu |
           ...
event_pmu: PE_NAME '/' event_config '/'

Using this rule, event could be now specified like:
  cpu/config=1,config1=2,config2=3/u

where pmu name 'cpu' is looked up via following path:
  ${sysfs_mount}/bus/event_source/devices/${pmu}

and config options are bound to the pmu's format definiton:
  ${sysfs_mount}/bus/event_source/devices/${pmu}/format

The hardcoded config options still stays and have precedence
over any format field defined with same name.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-50d8nr94f8k4wkezutrxvthe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 14:30:13 -03:00
Jiri Olsa
cd82a32e99 perf tools: Add perf pmu object to access pmu format definition
Adding pmu object which provides interface to pmu's sysfs
event format definition located at:
  ${sysfs_mount}/bus/event_source/devices/${pmu}/format

Following interface is exported:
  struct perf_pmu* perf_pmu__find(char *name);
  - this function returns pmu object, which is then
    passed as a handle to other interface functions

  int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
                       struct list_head *head_terms);
  - this function configures perf_event_attr struct based
    on pmu's format definitions and config terms data,
    containined in head_terms list.

Parser generator is used to retrive the pmu's format definition.
The generated parser is part of the patch. Added makefile rule
'pmu-parser' to generate the parser code out of the bison/flex
sources.

Added builtin test 'Test perf pmu format parsing', which could
be run like:
	perf test pmu

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-errz96u1668gj9wlop1zhpht@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 14:29:35 -03:00
Jiri Olsa
8f707d843c perf tools: Add config options support for event parsing
Adding a new rule to the event grammar to be able to specify
values of additional attributes of symbolic event.

The new syntax for event symbolic definition is:

event_legacy_symbol:  PE_NAME_SYM '/' event_config '/' |
                      PE_NAME_SYM sep_slash_dc

event_config:         event_config ',' event_term | event_term

event_term:           PE_NAME '=' PE_NAME |
                      PE_NAME '=' PE_VALUE
                      PE_NAME

sep_slash_dc: '/' | ':' |

At the moment the config options are hardcoded to be used for legacy
symbol events to define several perf_event_attr fields. It is:

  'config'   to define perf_event_attr::config
  'config1'  to define perf_event_attr::config1
  'config2'  to define perf_event_attr::config2
  'period'   to define perf_event_attr::sample_period

Legacy events could be now specified as:
  cycles/period=100000/

If term is specified without the value assignment, then 1 is
assigned by default.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-mgkavww9790jbt2jdkooyv4q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 14:26:06 -03:00
Jiri Olsa
89812fc81f perf tools: Add parser generator for events parsing
Changing event parsing to use flex/bison parse generator.
The event syntax stays as it was.

grammar description:

events: events ',' event | event

event:  event_def PE_MODIFIER_EVENT | event_def

event_def: event_legacy_symbol sep_dc     |
           event_legacy_cache sep_dc      |
           event_legacy_breakpoint sep_dc |
           event_legacy_tracepoint sep_dc |
           event_legacy_numeric sep_dc    |
           event_legacy_raw sep_dc

event_legacy_symbol:      PE_NAME_SYM

event_legacy_cache:       PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT |
                          PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT  |
                          PE_NAME_CACHE_TYPE

event_legacy_raw:         PE_SEP_RAW PE_VALUE

event_legacy_numeric:     PE_VALUE ':' PE_VALUE

event_legacy_breakpoint:  PE_SEP_BP ':' PE_VALUE ':' PE_MODIFIER_BP

event_breakpoint_type:    PE_MODIFIER_BPTYPE | empty

PE_NAME_SYM:              cpu-cycles|cycles                              |
                          stalled-cycles-frontend|idle-cycles-frontend   |
                          stalled-cycles-backend|idle-cycles-backend     |
                          instructions                                   |
                          cache-references                               |
                          cache-misses                                   |
                          branch-instructions|branches                   |
                          branch-misses                                  |
                          bus-cycles                                     |
                          cpu-clock                                      |
                          task-clock                                     |
                          page-faults|faults                             |
                          minor-faults                                   |
                          major-faults                                   |
                          context-switches|cs                            |
                          cpu-migrations|migrations                      |
                          alignment-faults                               |
                          emulation-faults

PE_NAME_CACHE_TYPE:       L1-dcache|l1-d|l1d|L1-data             |
                          L1-icache|l1-i|l1i|L1-instruction      |
                          LLC|L2                                 |
                          dTLB|d-tlb|Data-TLB                    |
                          iTLB|i-tlb|Instruction-TLB             |
                          branch|branches|bpu|btb|bpc            |
                          node

PE_NAME_CACHE_OP_RESULT:  load|loads|read                        |
                          store|stores|write                     |
                          prefetch|prefetches                    |
                          speculative-read|speculative-load      |
                          refs|Reference|ops|access              |
                          misses|miss

PE_MODIFIER_EVENT:        [ukhp]{0,5}

PE_MODIFIER_BP:           [rwx]

PE_SEP_BP:                'mem'

PE_SEP_RAW:               'r'

sep_dc:                   ':' |

Added flex/bison files for event grammar parsing. The generated
parser is part of the patch. Added makefile rule 'event-parser'
to generate the parser code out of the bison/flex sources.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-u4pfig5waq3ll2bfcdex8fgi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 14:20:21 -03:00
Jiri Olsa
641cc93881 perf: Adding sysfs group format attribute for pmu device
Adding sysfs group 'format' attribute for pmu device that
contains a syntax description on how to construct raw events.

The event configuration is described in following
struct pefr_event_attr attributes:

  config
  config1
  config2

Each sysfs attribute within the format attribute group,
describes mapping of name and bitfield definition within
one of above attributes.

eg:
  "/sys/...<dev>/format/event" contains "config:0-7"
  "/sys/...<dev>/format/umask" contains "config:8-15"
  "/sys/...<dev>/format/usr"   contains "config:16"

the attribute value syntax is:

  line:      config ':' bits
  config:    'config' | 'config1' | 'config2"
  bits:      bits ',' bit_term | bit_term
  bit_term:  VALUE '-' VALUE | VALUE

Adding format attribute definitions for x86 cpu pmus.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-vhdk5y2hyype9j63prymty36@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 14:06:06 -03:00
Jan Beulich
4a3d2d9bfb perf tools: Adjust make rules
Add rules to generate pre-processed files (just like are available for
the normal kernel build), and adjust the rule to create assembly files
from C ones to produce its output in the output directory rather than
in the source tree.

E.g.

  $ make -C tools/perf/ O=/tmp/perf /tmp/perf/builtin-top.i
  make: Entering directory `/home/git/linux/tools/perf'
      CC /tmp/perf/builtin-top.i
  make: Leaving directory `/home/git/linux/tools/perf'
  $ wc -l /tmp/perf/builtin-top.i
  31379 /tmp/perf/builtin-top.i
  $

Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4F588A0802000078000770FF@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-15 14:23:28 -03:00
Ingo Molnar
bea95c152d Merge branch 'perf/hw-branch-sampling' into perf/core
Merge reason: The 'perf record -b' hardware branch sampling feature is ready for upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:47:05 +01:00
Stephane Eranian
24bff2dc0f perf report: Fix annotate double quit issue in branch view mode
This patch fixes perf report to not go back two levels when
pressing the 'q' key while annotating in branch view mode.

When pressing 'q' in annotate mode and if the branch source
and target belong to different functions, perf now brings
up the annotation popup menu again to offer the option to
annotate the other branch source or target.

As part of the code restructuring in perf_evsel__hists_browse()
we also fix a memory leak on options[] in case of error.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331565210-10865-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:46:16 +01:00
Stephane Eranian
8bcd65fd29 perf report: Remove duplicate annotate choice in branch view mode
This patch removes the duplicated annotate selection when
browsing in branch view mode. If the sym and dso oof the branch
source and target are the same, then only one annotate choice is
proposed.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331565210-10865-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:46:16 +01:00
Peter Zijlstra
f9b4eeb809 perf/x86: Prettify pmu config literals
I got somewhat tired of having to decode hex numbers..

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Link: http://lkml.kernel.org/n/tip-0vsy1sgywc4uar3mu1szm0rg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:44:54 +01:00
Ingo Molnar
35239e23c6 Merge branch 'perf/urgent' into perf/core
Merge reason: We are going to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:44:11 +01:00
Peter Zijlstra
87e24f4b67 perf/x86: Fix local vs remote memory events for NHM/WSM
Verified using the below proglet.. before:

[root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 0
remote write

 Performance counter stats for './numa 0':

         2,101,554 node-stores
         2,096,931 node-store-misses

       5.021546079 seconds time elapsed

[root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 1
local write

 Performance counter stats for './numa 1':

           501,137 node-stores
               199 node-store-misses

       5.124451068 seconds time elapsed

After:

[root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 0
remote write

 Performance counter stats for './numa 0':

         2,107,516 node-stores
         2,097,187 node-store-misses

       5.012755149 seconds time elapsed

[root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 1
local write

 Performance counter stats for './numa 1':

         2,063,355 node-stores
               165 node-store-misses

       5.082091494 seconds time elapsed

#define _GNU_SOURCE

#include <sched.h>
#include <stdio.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <dirent.h>
#include <signal.h>
#include <unistd.h>
#include <numaif.h>
#include <stdlib.h>

#define SIZE (32*1024*1024)

volatile int done;

void sig_done(int sig)
{
	done = 1;
}

int main(int argc, char **argv)
{
	cpu_set_t *mask, *mask2;
	size_t size;
	int i, err, t;
	int nrcpus = 1024;
	char *mem;
	unsigned long nodemask = 0x01; /* node 0 */
	DIR *node;
	struct dirent *de;
	int read = 0;
	int local = 0;

	if (argc < 2) {
		printf("usage: %s [0-3]\n", argv[0]);
		printf("  bit0 - local/remote\n");
		printf("  bit1 - read/write\n");
		exit(0);
	}

	switch (atoi(argv[1])) {
	case 0:
		printf("remote write\n");
		break;
	case 1:
		printf("local write\n");
		local = 1;
		break;
	case 2:
		printf("remote read\n");
		read = 1;
		break;
	case 3:
		printf("local read\n");
		local = 1;
		read = 1;
		break;
	}

	mask = CPU_ALLOC(nrcpus);
	size = CPU_ALLOC_SIZE(nrcpus);
	CPU_ZERO_S(size, mask);

	node = opendir("/sys/devices/system/node/node0/");
	if (!node)
		perror("opendir");
	while ((de = readdir(node))) {
		int cpu;

		if (sscanf(de->d_name, "cpu%d", &cpu) == 1)
			CPU_SET_S(cpu, size, mask);
	}
	closedir(node);

	mask2 = CPU_ALLOC(nrcpus);
	CPU_ZERO_S(size, mask2);
	for (i = 0; i < size; i++)
		CPU_SET_S(i, size, mask2);
	CPU_XOR_S(size, mask2, mask2, mask); // invert

	if (!local)
		mask = mask2;

	err = sched_setaffinity(0, size, mask);
	if (err)
		perror("sched_setaffinity");

	mem = mmap(0, SIZE, PROT_READ|PROT_WRITE,
			MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	err = mbind(mem, SIZE, MPOL_BIND, &nodemask, 8*sizeof(nodemask), MPOL_MF_MOVE);
	if (err)
		perror("mbind");

	signal(SIGALRM, sig_done);
	alarm(5);

	if (!read) {
		while (!done) {
			for (i = 0; i < SIZE; i++)
				mem[i] = 0x01;
		}
	} else {
		while (!done) {
			for (i = 0; i < SIZE; i++)
				t += *(volatile char *)(mem + i);
		}
	}

	return 0;
}

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/n/tip-tq73sxus35xmqpojf7ootxgs@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:43:41 +01:00
Linus Torvalds
fde7d9049e Linux 3.3-rc7 2012-03-10 13:49:52 -08:00
Al Viro
c7b2855505 aio: fix the "too late munmap()" race
Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing.  As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...

We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.

Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 18:59:59 -08:00
Al Viro
86b62a2cb4 aio: fix io_setup/io_destroy race
Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit.  The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 18:59:59 -08:00
Linus Torvalds
86e0600833 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "I have two additional and btrfs fixes in my for-linus branch.  One is
  a casting error that leads to memory corruption on i386 during scrub,
  and the other fixes a corner case in the backref walking code (also
  triggered by scrub)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix casting error in scrub reada code
  btrfs: fix locking issues in find_parent_nodes()
2012-03-09 18:09:18 -08:00
Hugh Dickins
be22aece68 memcg: revert fix to mapcount check for this release
Respectfully revert commit e6ca7b89dc "memcg: fix mapcount check
in move charge code for anonymous page" for the 3.3 release, so that
it behaves exactly like releases 2.6.35 through 3.2 in this respect.

Horiguchi-san's commit is correct in itself, 1 makes much more sense
than 2 in that check; but it does not go far enough - swapcount
should be considered too - if we really want such a check at all.

We appear to have reached agreement now, and expect that 3.4 will
remove the mapcount check, but had better not make 3.3 different.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 15:32:20 -08:00
Thomas Gleixner
a7f4255f90 x86: Derandom delay_tsc for 64 bit
Commit f0fbf0abc0 ("x86: integrate delay functions") converted
delay_tsc() into a random delay generator for 64 bit.  The reason is
that it merged the mostly identical versions of delay_32.c and
delay_64.c.  Though the subtle difference of the result was:

 static void delay_tsc(unsigned long loops)
 {
-	unsigned bclock, now;
+	unsigned long bclock, now;

Now the function uses rdtscl() which returns the lower 32bit of the
TSC. On 32bit that's not problematic as unsigned long is 32bit. On 64
bit this fails when the lower 32bit are close to wrap around when
bclock is read, because the following check

       if ((now - bclock) >= loops)
       	  	break;

evaluated to true on 64bit for e.g. bclock = 0xffffffff and now = 0
because the unsigned long (now - bclock) of these values results in
0xffffffff00000001 which is definitely larger than the loops
value. That explains Tvortkos observation:

"Because I am seeing udelay(500) (_occasionally_) being short, and
 that by delaying for some duration between 0us (yep) and 491us."

Make those variables explicitely u32 again, so this works for both 32
and 64 bit.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # >= 2.6.27
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 12:43:27 -08:00
Linus Torvalds
c447064de4 sound fixes for 3.3-rc7
Nothing exciting here: just a few regression fixes for HD-audio and ASoC,
 also the support of missing 32bit compat ioctl for HDSPM.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQIcBAABAgAGBQJPWkMOAAoJEGwxgFQ9KSmk28sP/jGD+gCptQOFsA37fmkPKRNZ
 1ILs2iAPeEXL71R6gHUED/sarI4/wcUMQUMZkg4EZ81jiug7wWeT87VrjD6JiuEl
 PcYZU7PxWJgQ8xoPUEU9+HmjOnxrl1dRqrRYSEzApkCKq47inT9xPRQJek9LH9dD
 w8FCOf5n3i1Xbtb3TZsaUY2gKBF62x/xHA6ko8YXt531KLCbFskEMQ5uz442Aqa0
 HuhWMVgNJOxkGUkViE5hFz5dha8xi6RsdWW5wBCaRfMJoIE1LqP3p4L6YHfAYUN9
 aTgELyNMUgjZN77JgTyEn9VDc3fCJi9E4/bmTwJrPKpvkmZcUKkleUn8h7muS/9s
 8WI2rXFt+HJoOz+1xdfQ2o1J3uMcFePsIqWHiQlgjRLWn6I9E5hvWm36ExtgdaSr
 ySTCiLhGO6E1v27b9DYhU6gSxdjZgISNcRrEkGZKUcYf5iSOPkBJbhg3P4fz/Sas
 aCKGDgWVerWrGVVxsphJJ4uWxx9WZVLpKhGie8Zl9Umo4I6p33U+SX1kkRMJtKFl
 WMBph5qVOcVKDP8t2EPsK8/eZz7rMfNdAVT0YYUwB/iW/BPwb8M3KJt05Le8bhtc
 ma19c2MjEIHeeSsd6mVLW3NVrnSLNSrdiZLTzWnlg6zVbBcY9TAdkiUX1wZWERCN
 2NdPx0nP3Z1xbsVLorb9
 =aCRY
 -----END PGP SIGNATURE-----

Merge tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Nothing exciting here: just a few regression fixes for HD-audio and
  ASoC, also the support of missing 32bit compat ioctl for HDSPM."

* tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hdspm - Provide ioctl_compat
  ALSA: hda/realtek - Apply the coef-setup only to ALC269VB
  ALSA: hda - add quirk to detect CD input on Gigabyte EP45-DS3
  ASoC: neo1973: fix neo1973 wm8753 initialization
2012-03-09 12:14:23 -08:00
David Brown
8cd5c8661d MAINTAINERS: new git entry for arm/mach-msm
The msm git tree moved to

  git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm.git

Signed-off-by: David Brown <davidb@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 12:13:36 -08:00
Linus Torvalds
0ab5d757db Fix for C6X KSTK_EIP and KSTK_ESP macros.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPV41FAAoJEOiN4VijXeFPsuEQAJDPafvpu9KEqSV/TOGKyelu
 rQk94tvHSVuGeAZiFpjjzQ/u0ofGNOxBqwbVnQjHwt6ameWX9VOxjbR4ZzdF5h9i
 1KbeapNIOKjebHUAf2Sh5mO8MkY8TcTFUzFBk5rOhP8BpNKfpFgfVFxh6JZVFo/b
 7a0kokAFoH24XQXAx4GK11uNE697fQqfcqyxqZ3lRhZ6jtEiIjMkUeRFgLy/G6gE
 Xo3H03ETawbJWzl4PXOiYNFH1Xf3DhT1VM7hLJVuPIrB3KT2oBweGQGn2DKextXe
 /4WE3XwGr9euHbJcJ7c5P0Gu2P76QDhLjgjz/hQfJdMfkD4ZsH0/VrLdTDs5AWOT
 WgzJg40WUDWgL7GuJgL+pCjt6pGszQq5EXGwakHesJU6zT4xZu9NyxChoTzyZdjW
 HIx3NJe5JaeCv0hKpmHlSaaJMXiYKWwKs1d3GOiAKP84AvQlqdsV3UyPCFRNCKNv
 cO2lkkC6pdHnGlJezmNxXqYZaaHwFsbTIcmOLiNlb8B2Q1uttsm3cbS0AReosdC8
 5PLGlvEbXfe/MURrquorpGMKEq5DjnbNdwHx5RcUh/882hZEieDbGVO6R6vApTy2
 dDR5vVhaOfcc86ST0WFLRrjUfqzPUGxBrMvGDn7pwKHlKC5h9IWOptTB/nSgJGuf
 7kCcp9qpXff8fvP2l4/A
 =2BoN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull C6X fix from Mark Salter:
 "Fix for C6X KSTK_EIP and KSTK_ESP macros."

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  C6X: fix KSTK_EIP and KSTK_ESP macros
2012-03-09 07:27:38 -08:00
Linus Torvalds
0cacaf51a0 IOMMU fixes for Linux v3.3-rc6
Two fixes are queued up. The first is an additional fix for the OMAP
 initialization order issue and the second patch fixes a possible section
 mismatch which can lead to a kernel crash in the AMD IOMMU driver when
 suspend/resume is used and the compiler has not inlined the
 iommu_set_device_table function.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPWgA/AAoJECvwRC2XARrj7voQAN6evWicqjRDiXQgEC3muQFU
 OrA/Jz/i7+pHEYXcsTt0xHLb8juZNJAdkJjToB+oz7i/7D+TYRmJe+QRNkVmw7Ld
 d3DbUSUi9B3agvGblosKV3DYM8By1vTn9Gy2GNatW1yPuo5o4FHK2ePC5sn8Z/8z
 qwTZZnmvqluz7frNiw6Y3bNOqLd46z+9thUOoKmRn/fo3vKCOOVvb85yu1m/uqy6
 Dmpn6ep0w53jK29ZTKWcL8PW0YrLTEfszhMcVshFT+Y7GVSGnSxwgSh1fnZm/WL6
 z11L57dI0+7RS/z+cw+ko7ymIloV2v4ABRArMPIoLgbIQT0lidDNSqOQnPvWaBek
 MwdLL8W64lt2h4T7bLhDNRSDggWCX+EJYlk87O4hJYt4n57c3yT55z2+BGdoFivZ
 tzPshNWN4KVDMZCU6sTvzvz6eErwvro5wlVM2WxDVfXTxn6UTblP5uIQDGb2zVA9
 G95kK/OlK/s+giwOSOKxtR62livKEAuJ2Croa5LsdJnLdCo6ipvIz9cuAG6eij2W
 tulJQUN3FZr288iUAOPQ9xj6hWYM9RXqoYBxAHAvgYgGuirj9E6hjk/fL0y6XGQk
 jcg0bCdJjh2RzsLZR0eeslybtlrvUsBWYCPYCAAmRjUYHwZ6s822aO7qh0VnrFuQ
 csH09+3B0twvJ+ZRJibN
 =PZDd
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull two IOMMU fixes from Joerg Roedel:
 "The first is an additional fix for the OMAP initialization order issue
  and the second patch fixes a possible section mismatch which can lead
  to a kernel crash in the AMD IOMMU driver when suspend/resume is used
  and the compiler has not inlined the iommu_set_device_table function."

* tag 'iommu-fixes-v3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  x86/amd: iommu_set_device_table() must not be __init
  ARM: OMAP: fix iommu, not mailbox
2012-03-09 07:26:25 -08:00
Linus Torvalds
45b8da90f2 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull radeon drm stuff from Dave Airlie:
 "Just some radeon fixes, one is for an oops where we run out of ioremap
  space on some big hardware systems in 32-bit mode, stuff doesn't work
  properly but at least the machine will boot.

  One regression fix, and two bugs, one hw, one blit code."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms: fix hdmi duallink checks
  drm/radeon/kms: set SX_MISC in the r6xx blit code (v2)
  drm/radeon: deal with errors from framebuffer init path.
  drm/radeon: fix a semaphore deadlock on pre cayman asics
2012-03-09 07:23:17 -08:00
Linus Torvalds
e304dfdb03 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking from David Miller:

1) IPV4 routing metrics can become stale when routes are changed by the
   administrator, fix from Steffen Klassert.

2) atl1c does "val |= XXX;" where XXX is a bit number not a bit mask,
   fix by using set_bit.  From Dan Carpenter.

3) Memory accounting bug in carl9170 driver results in wedged TX queue.
   Fix from Nicolas Cavallari.

4) iwlwifi accidently uses "sizeof(ptr)" instead of "sizeof(*ptr)", fix
   from Johannes Berg.

5) Openvswitch doesn't honor dp_ifindex when doing vport lookups, fix
   from Ben Pfaff.

6) ehea conversion to 64-bit stats lost multicast and rx_errors
   accounting, fix from Eric Dumazet.

7) Bridge state transition logging in br_stp_disable_port() is busted,
   it's emitted at the wrong time and the message is in the wrong tense,
   fix from Paulius Zaleckas.

8) mlx4 device erroneously invokes the queue resize firmware operation
   twice, fix from Jack Morgenstein.

9) Fix deadlock in usbnet, need to drop lock when invoking usb_unlink_urb()
   otherwise we recurse into taking it again.  Fix from Sebastian Siewior.

10) hyperv network driver uses the wrong driver name string, fix from
    Haiyang Zhang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net/hyperv: Use the built-in macro KBUILD_MODNAME for this driver
  net/usbnet: avoid recursive locking in usbnet_stop()
  route: Remove redirect_genid
  inetpeer: Invalidate the inetpeer tree along with the routing cache
  mlx4_core: fix bug in modify_cq wrapper for resize flow.
  atl1c: set ATL1C_WORK_EVENT_RESET bit correctly
  bridge: fix state reporting when port is disabled
  bridge: br_log_state() s/entering/entered/
  ehea: restore multicast and rx_errors fields
  openvswitch: Fix checksum update for actions on UDP packets.
  openvswitch: Honor dp_ifindex, when specified, for vport lookup by name.
  iwlwifi: fix wowlan suspend
  mwifiex: reset encryption mode flag before association
  carl9170: fix frame delivery if sta is in powersave mode
  carl9170: Fix memory accounting when sta is in power-save mode.
2012-03-09 07:14:44 -08:00
Stephane Eranian
a68c2c5817 perf report: Enable TUI in branch view mode
This patch updates perf report to support TUI mode
when the perf.data file contains samples with branch
stacks.

For each row in the report, it is possible to annotate
either the source or target of each branch.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:08 +01:00
Stephane Eranian
993ac88d58 perf report: Auto-detect branch stack sampling mode
This patch enhances perf report to auto-detect when the
perf.data file contains samples with branch stacks. That way it
is not necessary to use the -b option.

To force branch view mode to off, simply use --no-branch-stack.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:08 +01:00
Stephane Eranian
330aa675b4 perf record: Add HEADER_BRANCH_STACK tag
This patch adds a new feature bit, namely,
HEADER_BRANCH_STACK.  When present, it indicates
that sample records may contain branch stack.

This could be useful to a viewer to switch to
branch mode without having to parse all the
samples or without a specific cmdline option.

This will be used in a subsequent patch to
enhance perf report with branch stacks.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:08 +01:00
Stephane Eranian
a5aabdacde perf record: Provide default branch stack sampling mode option
This patch chanegs the logic of the -b, --branch-stack options
of perf record.

Based on users' request, the patch provides a default filter
mode with the -b (or --branch-any) option.  With the option,
any type of taken branches is sampled.

With -j (or --branch-filter), the user can specify any
valid combination of branch types and privilege levels
if supported by the underlying hardware.

The -b (--branch any) is a shortcut for: --branch-filter any.

 $ perf record -b foo

or:

 $ perf record --branch-filter any foo

For more specific filtering:

 $ perf record --branch-filter ind_call,u foo

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:07 +01:00
Stephane Eranian
114382a0ae perf tools: Make perf able to read files from older ABIs
This patches provides a way to handle legacy perf.data
files.  Legacy files are those using the older PERFFILE
signature.

For those, it is still necessary to detect endianness but
without comparing their header->attr_size with the
tool's own version as it may be different. Instead, we use
a reference table for all known sizes from the legacy era.

We try all the combinations for sizes and endianness. If we find
a match, we proceed, otherwise we return: "incompatible file
format".

This is also done for the pipe-mode file format.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-19-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:07 +01:00
Stephane Eranian
62db90681c perf tools: Fix ABI compatibility bug in print_event_desc()
This patches cleans up local variable types for msz and ret.
They need to be size_t and ssize_t respectively.

It also fixes a bug whereby perf would not read attr struct
with a different size than what it knows about.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-18-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:06 +01:00
Stephane Eranian
69996df486 perf tools: Enable reading of perf.data files from different ABI rev
This patch allows perf to process perf.data files generated
using an ABI that has a different perf_event_attr struct size,
i.e., a different ABI version.

The perf_event_attr can be extended, yet perf needs to cope with
older perf.data files. Similarly, perf must be able to cope with
a perf.data file which is using a newer version of the ABI than
what it knows about.

This patch adds read_attr(), a routine that reads a
perf_event_attr struct from a file incrementally based on its
advertised size. If the on-file struct is smaller than what perf
knows, then the extra fields are zeroed. If the on-file struct
is bigger, then perf only uses what it knows about, the rest is
skipped.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-17-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:06 +01:00
Stephane Eranian
cb5d769990 perf: Add ABI reference sizes
This patch adds reference sizes for revision 1
and 2 of the perf_event ABI, i.e., the size of
the perf_event_attr struct.

With Rev1: config2 was added = +8 bytes
With Rev2: branch_sample_type was added = +8 bytes

Adds the definition for Rev1, Rev2.

This is useful for tools trying to decode the revision
numbers based on the size of the struct.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-16-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:05 +01:00
Roberto Agostino Vitillo
b50311dc2a perf report: Add support for taken branch sampling
This patch adds support for taken branch sampling, i.e, the
PERF_SAMPLE_BRANCH_STACK feature to perf report. In other
words, to display histograms based on taken branches rather
than executed instructions addresses.

The new option is called -b and it takes no argument. To
generate meaningful output, the perf.data must have been
obtained using perf record -b xxx ... where xxx is a branch
filter option.

The output shows symbols, modules, sorted by 'who branches
where' the most often. The percentages reported in the first
column refer to the total number of branches captured and
not the usual number of samples.

Here is a quick example.
Here branchy is simple test program which looks as follows:

void f2(void)
{}
void f3(void)
{}
void f1(unsigned long n)
{
  if (n & 1UL)
    f2();
  else
    f3();
}
int main(void)
{
  unsigned long i;

  for (i=0; i < N; i++)
   f1(i);
  return 0;
}

Here is the output captured on Nehalem, if we are
only interested in user level function calls.

$ perf record -b any_call,u -e cycles:u branchy

$ perf report -b --sort=symbol
    52.34%  [.] main                   [.] f1
    24.04%  [.] f1                     [.] f3
    23.60%  [.] f1                     [.] f2
     0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
     0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
     0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
     0.01%  [k] __printf               [k] _IO_vfprintf_internal
     0.01%  [k] main                   [k] __printf

About half (52%) of the call branches captured are from main()
-> f1(). The second half (24%+23%) is split in two equal shares
between f1() -> f2(), f1() ->f3(). The output is as expected
given the code.

It should be noted, that using -b in perf record does not
eliminate information in the perf.data file. Consequently, a
typical profile can also be obtained by perf report by simply
not using its -b option.

It is possible to sort on branch related columns:

   - dso_from, symbol_from
   - dso_to, symbol_to
   - mispredict

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:05 +01:00
Roberto Agostino Vitillo
bdfebd848f perf record: Add support for sampling taken branch
This patch adds a new option to enable taken branch stack
sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature
of perf_events.

There is a new option to active this mode: -b.
It is possible to pass a set of filters to select the type of
branches to sample.

The following filters are available:

 - any : any type of branches
 - any_call : any function call or system call
 - any_ret : any function return or system call return
 - any_ind : any indirect branch
 - u:  only when the branch target is at the user level
 - k: only when the branch target is in the kernel
 - hv: only when the branch target is in the hypervisor

Filters can be combined by passing a comma separated list
to the option:

$ perf record -b any_call,u -e cycles:u branchy

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:05 +01:00
Roberto Agostino Vitillo
b5387528f3 perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
This patch adds:

 - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
 - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
 - build histograms on branches

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:04 +01:00
Linus Torvalds
9f8050c4f9 Last minute fixes for 3.3
One samsung build fix due to a mis-applied patch, and a small set of OMAP fixes.
 This should be the last from arm-soc for 3.3, hopefully.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPWQktAAoJEIwa5zzehBx3OK4P/i3+nTB2WW8i8ll8pTHybSAj
 YuQGOtsvJDCkZQ6tIgMahNH/Pr2Q2gFe6cjP/akbWWCfUvOl36ZAl9NYY8eexoco
 ZGkMEXfhgtAsVepzEF1/64lwOgrh+AikzlL0q0tA2FmClcAejwFE7ht6zPcwFgJj
 9lmR201/6OLc+qkmvrKRbmpO+eYZUVuQ4cOEMOxebXZN/tn5S/R2QWC3+PkUMbWz
 7bngLHjO97legI0SGzfaKOx/02y/4hIDSybIPDOKDuXi1jUJlkrmnP09sQQLYdKN
 uxu56mn+vUPCIGNwmE1/ufSneM8pZEAHaRkUjFOrxtsdGKVsxFz1vyJ3OAQHd8FU
 WU+fKOew/Jf+jRpApydKWJ0yOdyTZIVqncvWpFs1RlmAVKPE96M8IL9oUZaV8rgJ
 c9XUCvDFF0OuZQyMoacSzbVfcAy9SE/f7fNXLUpI9rEriYfY5waCXcQrYGwjRtmD
 j2iY0pyaU72i7mtTbcsInPZ/bZ7gO+D1MmFl9WzduLWDJoGHJVcUyC1+YrJjoGF3
 ggplns12qTtdgfiqTKV2I5qf2nHHbk8kmhII1fkYPZwdG2enKhV+r5Oth3zKPVXF
 oOD8YnjACuad05YdkpzQL2DSaiGtPVAXHP4xMN9ktNPSJHwx19HNDiTKex6oxcZf
 Dki+FinbRiFYUP/O91Xe
 =PLTM
 -----END PGP SIGNATURE-----

Merge tag 'fixes-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull last minute fixes from Olof Johansson:
 "One samsung build fix due to a mis-applied patch, and a small set of
  OMAP fixes.  This should be the last from arm-soc for 3.3, hopefully."

* tag 'fixes-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: S3C2440: Fixed build error for s3c244x
  ARM: OMAP2+: Fix module build errors with CONFIG_OMAP4_ERRATA_I688
  ARM: OMAP: id: Add missing break statement in omap3xxx_check_revision
  ARM: OMAP2+: Remove apply_uV constraints for fixed regulator
  ARM: OMAP: irqs: Fix NR_IRQS value to handle PRCM interrupts
2012-03-08 17:32:42 -08:00
Linus Torvalds
9a0cee7114 Another small, clear fix in a specific driver.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPV7ccAAoJEBus8iNuMP3dnNAP/RCk/veO6dXiimamtD6CsaDi
 V8tkSE6S4kJ1Iq6xiIAlx15zelsVNLXz3tOW/z4tAhd7TQRvriVeRWB7HjFOrOuX
 HVYdxs4rZ60PUb2CzOhnk0NONsdvYo1OihMqJ3nxVD0GYz3s4V44Nzh56aUfHFXA
 ybAxPDvMosObyx25XsXvXJmwHq2Bmsy1aZp1k6wEPKPqaoJDgfr05pYJEy+5YkuH
 yUQkP4J2zxeuehQemp12uxrekUyQ9Xj9YoAzENx8WTRvsbp6rTJzcid4bjN9wJXB
 pyebTlWjc9lqvzaw6sBDoOGnoC2F2Nzrq+SvzQB77WRRz9fiDfGfp7w9dg2D1ZmW
 IosNJjXbFFj7VVfpjsDmKQWPQMWfM7PV5lgttTC850yW44wHKf/PsRqZc43KeLrv
 Za+xFaL1qS1dBc5tP+23+i5njFxGraC5NOR5YIYOnvfUJDiFMaJ7qc9kX6qXdBxr
 R1aEg8jkxK3B+9z/EX26eHzeKOXmPqno5t9TrZFjNjgX1JYt4cc9Sbz7ZipVH3V/
 QEUsKAgvLQ7bW1eXlRxYNWPPKIMaQmMlAr3xBANF+rCflBY18KDeJKK81Hv+DRAK
 rfkfstNVAKM2jRzgEUsYjPmSnO2KJJ3PZjEemWAw6nSvZaNdz9NcmimmGXB9DuKi
 vIXvW5NUL1d547vLE0ct
 =X10j
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "Another small, clear fix in a specific driver."

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: tps65910: Configure correct value for VDDCTRL vout reg
2012-03-08 17:25:17 -08:00
Linus Torvalds
ee0849c911 Minor bug fixes and documentation updates for v3.3-rc5
Fixes up a duplicate #include, adds an empty implementation of
 of_find_compatible_node() and make git ignore .dtb files.  And fix
 up bus name on OF described PHYs.  Nothing exciting here.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPWPvsAAoJEEFnBt12D9kB8QsP/2e6jv7CQ/wUhihzw126Y8yi
 4HLG3ByJ382/c3LXg3juC+WfMvlpqEa0yqhikFKgIsm7cNbv7PkPS583pujdSXo0
 IRykiT3Q9+XbO7L+9C88miYIQT8+IT/AjIjfuRwlP4gLPNUJhknpYhYOc09YzwOp
 zhheeGVyeyTay+beQXip8gOEq2dYEl4IKsItagCBZfkDvj3Y8yDwTlP7f2j0FYZi
 uODa3NFKE4uc0U2chtro0Vt87TRfbnIQ93SbvEWyQBWMEbPqT3l1Q94AeFGubCLj
 RX910WvurCE4evzo2ZJzXPt9gPEyGIgtMGbjh+cENfT5/wNB2HWk1ftKnss5yEjp
 WmcbwWKwXPwRPc5EpRdgabBCRgouCZghtcnJpkoXKSwbEt/nj3no8GOZXQnv+rSL
 Ga0BSk1jJYC9DRpaIrLvdxXZF7vy1nug8fgU9ALi2R0gTmN+k6tHlLh60EWMSCBF
 YCXv3fUd9IVJfOYsu4qMk0Pu40pW9rDc698Nxiep7C0iNhlddq8fiHTwB3DdoinC
 iYmz+wEdDDp/68qY/mguXXtr5GAFSz1PUOwSJkh2zW6LnBDEv1djfhoA+5gwZX2v
 I1IwfNHMn/XAEuhOFWMo9CejB0rDoBhigwucH8EilbIdBYVCCEtYaubO/Gq6HmId
 h7hx9sWjrwF9HY0N9EHX
 =/fxH
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull minor devicetree bug fixes and documentation updates from Grant Likely:
 "Fixes up a duplicate #include, adds an empty implementation of
  of_find_compatible_node() and make git ignore .dtb files.  And fix up
  bus name on OF described PHYs.  Nothing exciting here."

* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  doc: dt: Fix broken reference in gpio-leds documentation
  of/mdio: fix fixed link bus name
  of/fdt.c: asm/setup.h included twice
  of: add picochip vendor prefix
  dt: add empty of_find_compatible_node function
  ARM: devicetree: Add .dtb files to arch/arm/boot/.gitignore
2012-03-08 17:24:27 -08:00
Linus Torvalds
7d77696e92 SPI section mismatch bug fix for v3.3-rc3
Russell King (1):
       Fix section mismatch in spi-pl022.c
 
 Minor fix for pl022_dma_probe() function which was put in the wrong section.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPWPqWAAoJEEFnBt12D9kBI/QQAJ+4yq7ZB9GAh4iEvfpbZuvZ
 28pO17EBdNo/gcWk7JLRLpgd5pCmcxFUgwcVmALm8n0yW/sgg6xpKBh3Mi3NKkGl
 EcFCEBJr6G1eS/xZwxjtgPyZMrNyMh0EuyxSn6yBZrCZ/qrQp/yRO78Cxlx4ppV5
 rve4PeQ2DMtdfpwMfNxs5vDulzK404Bq5EwPyV4jMr33eCkvEfsBSSDCJoo57jLv
 YBguqzT5VmGvyZ47Gk5ufkSwx29oFv0wtQcvnTtAISH7wgIwc83dvRHhgguaFNjt
 BKVk/Pt75TTKJ/sfMbnXFQwqCeGbwa+2ft+6Q1rSv+Uj2tzwkrf+iEG51UGTMfJF
 eESU9VcFNlzxanQNsOUOiyDe72mYeD7KGh6Z12vliGLshgnBiZOn0sUYQiTj2KB/
 0MDTcPZHAikg4GVdQfsPhXAYt6bYEHL2loqtY9GJB8MSdUZHxP5+/QWl9P4zpXp0
 yfuzwtCjuqS+dq7DE0VCCOISzcxRFJHmoOEXFP1qMnn3DQGairk1yVbkHJlJOCRq
 GlfeK1V94EpG2J7oYIP77EufrkY2ASDSLD2jk90SCkeBh7EyOchHHSKm8ehS+P6C
 2UsTJ1PdevP1lsR/cmQDhEmENrwHMSD8Uc06DzO8OYrRFiL8cU0uF1pvFTEzsR64
 ou8ljS0OkspLQKf24SIi
 =nTfV
 -----END PGP SIGNATURE-----

Merge tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull SPI section mismatch bug fix for v3.3-rc3 from Grant Likely:
 "Minor fix for pl022_dma_probe() function which was put in the wrong
  section."

* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  Fix section mismatch in spi-pl022.c
2012-03-08 17:23:45 -08:00
Linus Torvalds
42a6c7ef3b Four patches since v3.3-rc6:
1bd612a hwmon: (jc42) Add support for AT30TS00, TS3000GB2, TSE2002GB2, and MCP9804
 7ad6307 hwmon: (zl6100) Maintain delay parameter in driver instance data
 7cb3c44 hwmon: (pmbus_core) Fix maximum number of POUT alarm attributes
 4de8612 hwmon: (jc42) Add support for ST Microelectronics STTS2002 and STTS3000
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJPV7njAAoJEMsfJm/On5mBJQAQAJeqsVDIdsYAok5DUcF3NLYV
 CRhrNWSrGvrz0w9+dovn3R0kIP/mP5XeAsVswokDbVlXL6JYa1jJuhQo7WvrA1sI
 OEOGe9E9ypH7Bol+BtH293TJRkGbTRfmi2c8vjnkNYAXNoYvXKd91ZUhN+lhoLxW
 BqGgHF3wgZZlw3jrCJC78Glmnu4fmy18g2oUqJbPgdtBRh2coDw0i1cvAPM8ud/b
 Aqpb+DF2Im2qMFgiflaWkB0fU/VJSxQMDYtgguOVeGTIuxCGVihwM2zGjJtieHRD
 w4stWfUWqhotBlsDlfrTnhgZZ4dFYvxo98as5HhWnzMdwwOoiwx2Nhv4M1ZpCi1d
 KIJrGKpsgVTGL2b6PNJhAusCO8slRBJ44sb49hTk1baFd9YuerqHReUw5/Y0+Jk/
 8juOKOVoX27DW3rZkq7KX6AEdDB9l3dnd3D8sSLyPmMeRVVuxUsQjK7PCbDtjQ6u
 BASNRGAp4T7+pGhq2yQmDAoawDPYesa5hg/6XD2x5iHCdI/TMxAHgP0qVkfiOcYX
 IxYS/rgeBoH42+Aak5XXwafbTfdiWaRxpMnRnwgaTUxaOWu3GauIQb5TzWZMQTKm
 xwZkxoS/XVDADXCzZxoCop4nbYnFwqKq2p+icwleSuVyN/+hI2lsz8wdsYFFWKnZ
 ApulwsYjhhtrtDFRysiB
 =mjD6
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull four hwmon patches from Guenter Roeck

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (jc42) Add support for AT30TS00, TS3000GB2, TSE2002GB2, and MCP9804
  hwmon: (zl6100) Maintain delay parameter in driver instance data
  hwmon: (pmbus_core) Fix maximum number of POUT alarm attributes
  hwmon: (jc42) Add support for ST Microelectronics STTS2002 and STTS3000
2012-03-08 17:22:54 -08:00
Linus Torvalds
5d0edf2915 Device-mapper fixes for 3.3.
Eight small device-mapper bug fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPV8+yAAoJEK2W1qbAHj1nZVAQAI8TNKwnBpKSW3Y9XFHqWjEx
 71wbjDkKkdEUWy52CAkSoRnQdX+ABxxGr5R60n/vJvHi4yDse56LddPzKAo4zD3c
 DVh6RB8CTIY+2IXGzjkDtelmKogKyAMlhmRoj0oLb5/29n6lnn6A0vkq4OimuFJO
 IIdgJxpRLqmV8NcSVC7qCEoErxzTNz9w7HaBBs73VhF8AcN/6Qi/z55zDOzT/Iz8
 iMHGmOHJBb8OxMN8BWWFdDh2YUz3isbM1xbBerYxy3P3WCHpxGBt7yRiHm3Yd5il
 USnJN3Kz0w6Orhgu1eeAuJz1A9cdSP62AQDdM91+v3nHz3mtTdAljmJZgzgzqs5u
 SRO24J6FD201DNh/RitDC1UzNOBqeapfqprT/gH+qM4Pl6X+vuXiSe5cxx+lTOhJ
 GErI1XYpTfzymdpQfqj6VnDMevRf0Hz+mSjEiUh8qjUv9bXHkmTrzjxCvAIEM+4h
 fJSQ0Fp77eV7Du9HkkFbEXVTYOe8VO+6E9AaplBAjZxHS6w+5tMFkHTM28JPxS98
 rYAks9QKbaZaEYZiNv7htux8n2OS9IeGHdLQpsooLh6lD4GxvBJ7NC8wUkfUzn27
 zEr2vqAYuA3PiccSHnT7tlN0PN1JlOjDCf+cdQkKfJj5w0E/qS2Fiv2UFIRLRPEa
 blSbf7wU0mpvorQJn/bd
 =lLJB
 -----END PGP SIGNATURE-----

Merge tag 'dm-3.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper fixes for 3.3 from Alasdair Kergon

Eight small device-mapper bug fixes.

* tag 'dm-3.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm raid: fix flush support
  dm raid: set MD_CHANGE_DEVS when rebuilding
  dm thin metadata: decrement counter after removing mapped block
  dm thin metadata: unlock superblock in init_pmd error path
  dm thin metadata: remove incorrect close_device on creation error paths
  dm flakey: fix crash on read when corrupt_bio_byte not set
  dm io: fix discard support
  dm ioctl: do not leak argv if target message only contains whitespace
2012-03-08 17:21:51 -08:00