Tool predicts how fast code will run on a chip: Machine-learning
system should enable developers to improve computing efficiency in a
range of applications.
By Rob Matheson, Massachusetts Institute of Technology
MIT
researchers have invented a machine-learning tool that predicts how
fast computer chips will execute code from various applications.
To get code to run as fast as possible, developers and compilers —
programs that translate programming-language code into machine-readable code —
typically use performance models that run the code through a simulation
of a given chip architecture.
Compilers use that information to automatically optimize code, and
developers use it to tackle performance bottlenecks on the
microprocessors that will run it. But performance models for machine
code are handwritten by a relatively small group of experts and are not
properly validated. As a consequence, the simulated performance
measurements often deviate from real-life results.
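To make the idea concrete, a hand-written performance model can be as simple as a table of assumed per-instruction cycle costs summed over a basic block. The sketch below is a deliberately toy illustration: the opcode names and cycle costs are placeholders, not real Intel figures, and it is far cruder than the expert-built models the researchers set out to replace.

```python
# A deliberately simplified sketch of what a hand-written performance model does:
# map each instruction to an assumed cycle cost and sum over a basic block.
# The opcodes and costs below are illustrative placeholders, not real latencies.

ASSUMED_CYCLE_COST = {
    "mov": 1,
    "add": 1,
    "imul": 3,
    "div": 20,
    "load": 4,
}

def predict_cycles(basic_block):
    """Estimate cycles for a basic block given as a list of opcode strings."""
    return sum(ASSUMED_CYCLE_COST.get(op, 1) for op in basic_block)

if __name__ == "__main__":
    block = ["load", "imul", "add", "mov"]
    print(predict_cycles(block))  # 4 + 3 + 1 + 1 = 9 cycles (toy numbers)
```

Real models of this kind must account for instruction scheduling, port contention, and memory behavior, which is exactly why they are hard to write and validate by hand.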
In a series of conference papers, the researchers describe a novel
machine-learning pipeline that automates this process, making it easier,
faster, and more accurate. In a paper presented at the International
Conference on Machine Learning in June, the researchers presented
Ithemal, a neural-network model that
trains on labeled data in the form of “basic blocks” — fundamental
snippets of computing instructions — to automatically predict how long
it takes a given chip to execute previously unseen basic blocks. Results
suggest Ithemal performs far more accurately than traditional
hand-tuned models.
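As a rough illustration of what such a learned model looks like, the sketch below trains a small recurrent network to regress measured cycle counts from tokenized instructions. It is a minimal stand-in with assumed names and shapes, not the authors' Ithemal implementation, which uses a hierarchical LSTM over tokens and instructions.

```python
# Minimal sketch (not the authors' code): treat a basic block as a sequence of
# instruction tokens and train a recurrent network to regress its measured
# cycle count. Data below is fabricated purely to show the training-loop shape.
import torch
import torch.nn as nn

class ThroughputRegressor(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded instruction tokens
        x = self.embed(token_ids)
        _, (h_n, _) = self.lstm(x)              # final hidden state summarizes the block
        return self.head(h_n[-1]).squeeze(-1)   # predicted cycle count per block

model = ThroughputRegressor(vocab_size=1000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
tokens = torch.randint(0, 1000, (8, 20))        # 8 blocks, 20 tokens each
measured_cycles = torch.rand(8) * 100           # placeholder "ground truth" timings
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(tokens), measured_cycles)
loss.backward()
optimizer.step()
```

The appeal of this approach is that the model learns the chip's behavior from measured timings rather than from rules written down by experts.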
Then, at the November IEEE International Symposium on Workload
Characterization, the researchers presented a benchmark suite of basic
blocks from a variety of domains, including machine learning, compilers,
cryptography, and graphics, that
can be used to validate performance models. They pooled more than
300,000 of the profiled blocks into an open-source dataset called BHive.
During their evaluations, Ithemal predicted how fast Intel chips would
run code even better than a performance model built by Intel itself...
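In practice, validating a model against such a suite amounts to comparing its predicted cycle counts with the measured ones and reporting an aggregate error. The sketch below assumes a hypothetical two-column CSV of (hex-encoded block, measured throughput); the actual BHive file layout may differ.

```python
# Hedged sketch of validating a performance model against BHive-style data:
# compare predicted and measured cycle counts, report mean absolute percentage
# error. The CSV layout and file name are assumptions for illustration only.
import csv

def mean_abs_pct_error(pairs):
    """pairs: iterable of (predicted, measured) cycle counts."""
    errs = [abs(pred - meas) / meas for pred, meas in pairs if meas > 0]
    return 100.0 * sum(errs) / len(errs)

def validate(model_predict, csv_path):
    pairs = []
    with open(csv_path) as f:
        for block_hex, measured in csv.reader(f):
            pairs.append((model_predict(block_hex), float(measured)))
    return mean_abs_pct_error(pairs)

# Hypothetical usage:
# error = validate(my_model.predict, "bhive_skylake.csv")
# print(f"MAPE: {error:.1f}%")
```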
Next, the researchers are studying methods to make models interpretable.
Much of machine learning is a black box, so it’s not really clear why a
particular model made its predictions. “Our model is saying it takes a
processor, say, 10 cycles to execute a basic block. Now, we’re trying to
figure out why,” Carbin says. “That’s a fine level of granularity that
would be amazing for these types of tools.”
Source: SciTechDaily