Timothy Jones's blog

Software Prefetching for Indirect Memory Accesses

23 February 2017

I’ve always considered software prefetching to be something of a black art. There have been times in the past when I’ve looked at my code, noticed a load is causing problems and tried inserting one or more software prefetches to alleviate the issue. Mostly this hasn’t worked, although I’ve never been sure why. In fact, even when it has worked I haven’t been totally sure why it has, usually because it’s involved a lot of trial and error in trying out different options before I hit on improved performance.
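For anyone who hasn't tried it, here is a minimal sketch of what inserting a software prefetch can look like for an indirect access of the form `data[index[i]]`. It assumes GCC or Clang, which provide the `__builtin_prefetch` builtin. The function name `sum_indirect` and the `distance` parameter (how many iterations ahead to prefetch) are hypothetical names chosen for illustration, and the right distance has to be found by measurement on the target machine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums data[index[i]] over all i, issuing a prefetch for the element that
// will be needed `distance` iterations later. Illustrative sketch only:
// `distance` is an assumed tuning knob, not a recommended value.
long sum_indirect(const std::vector<std::size_t>& index,
                  const std::vector<long>& data,
                  std::size_t distance) {
    long total = 0;
    const std::size_t n = index.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Guard the look-ahead so index[i + distance] is never read out of bounds.
        if (i + distance < n) {
            assert(index[i + distance] < data.size());
            // GCC/Clang builtin: a hint to bring this address into the cache.
            __builtin_prefetch(&data[index[i + distance]]);
        }
        total += data[index[i]];
    }
    return total;
}
```

The prefetch is only a hint, so the function returns the same result with or without it. Any benefit shows up in run time alone, which is part of why tuning it tends to involve trial and error.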

Now it turns out that most of the time I was probably trying to prefetch the wrong things. Trying to prefetch linked data structures, which are those that involve pointer chasing (like a linked list),...

My Year 2016

23 December 2016

I started blogging in October 2015 with the aim of publicising my group’s research a little more, having a space to write about topics and work that weren’t going to be published, and delving into our research results in more detail than is possible in a page-constrained article. A year on, I wanted to look back and see how things had gone in my first year as a lecturer, but the events of this October overtook me. Now, at the end of the calendar year when everything is calmer, it seems like a good opportunity to summarise the last twelve months and point to the blog posts I wrote and the events that I didn’t find time to talk about. Given the season I’ll try not to make it...

Negar Miralaei

8 November 2016

It’s with immense sadness that I write about my PhD student, Negar Miralaei, who died in an accident back home in Iran on the 26th October. She will be sorely missed within my group and by everyone within the Computer Laboratory. Many tributes have been paid to her warmth, kindness and dedication to her research. You can read those from the Computer Lab and Varsity, for example.

Despite this being a short post, it has taken me nearly two weeks to write, such is the shock, sorrow and emptiness I have felt since being told the tragic news. Negar was a ray of sunshine in our group, always smiling, even in the face of adversity. She was...

The Lynx Queue

9 August 2016

This post is about my group’s second ICS paper from June this year, which describes a new single producer / single consumer (SP/SC) software queue that we developed for frequent inter-core communication. It’s faster than existing implementations and we call it Lynx. It’s available on my group’s data page.
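To make the term concrete, the sketch below shows a generic, textbook SP/SC ring buffer in C++. It is not the Lynx queue and says nothing about how Lynx achieves its speed. The class name `SpscRing` and its methods `try_push` and `try_pop` are hypothetical names chosen for illustration. The only assumption is that exactly one thread pushes and exactly one thread pops.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Generic bounded single-producer / single-consumer ring buffer.
// A textbook design for illustration only, NOT the Lynx queue.
template <typename T>
class SpscRing {
public:
    // One slot is kept empty so that "full" and "empty" can be told apart.
    explicit SpscRing(std::size_t capacity)
        : slots_(capacity + 1), head_(0), tail_(0) {
        assert(capacity > 0);
    }

    // Producer thread only. Returns false if the queue is full.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % slots_.size();
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // full
        }
        slots_[tail] = value;
        // Release: publish the slot contents before the new tail is visible.
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns false if the queue is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = slots_[head];
        // Release: hand the slot back to the producer only after reading it.
        head_.store((head + 1) % slots_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> slots_;
    std::atomic<std::size_t> head_;  // next slot to read, written by consumer
    std::atomic<std::size_t> tail_;  // next slot to write, written by producer
};
```

Because each index is written by only one side, no locks are needed. The acquire and release orderings are what make a pushed value visible to the consumer before the updated index is.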

Initially, we didn’t set out to create a new queue. We were experimenting with transient error detection techniques in software. Transient, or soft, errors are faults that occur sporadically within a microprocessor, causing a data value or instruction to change. They are the result of strikes to the chip from cosmic rays (or usually the secondary particles they excite) or alpha particles from...

Hardware Graph Prefetchers

3 June 2016

This week sees the publication of two papers from my research group at ICS 2016 and so, in this post, I’d like to look a little more into one of these schemes: the graph prefetcher that my student, Sam, has developed.

Graph workloads are important in a number of domains, and becoming increasingly so. You only have to look at the numerous social media applications to see examples of graph-based data (e.g. in a network of people, each person is a vertex and the edges represent links to friends). But graph representations are also significant in less publicly-visible application areas, such as those in scientific computing or “big data” analytics. However, efficient processing of graph workloads is often...

Minute Madness on Program Parallelisation

25 May 2016

Today was the annual Wheeler lecture at the Computer Laboratory, and before the main event, a talk by Andrew Herbert, there was a Minute Madness where people from across the Lab, ranging from MPhil students through to professors, talked for one minute about their research with a single slide as a prop. My slide and something approximating the words I used are below.

“Hello! My group works on ways of making applications go faster, through a technique called program parallelisation.

If you look on the left of the slide, the red wavy arrow represents a regular sequential application with a single thread of execution within it. This means that instructions execute one...

Panini FIFA Sticker Collections

14 March 2016

My eldest son likes collecting things. Of course all children seem to like picking up random objects and hoarding them forever, and we’ve had our fair share of leaves, stones, food wrappers and other assorted paraphernalia floating around the house until we can divert attention elsewhere and get rid of them. But my son also really likes collecting sets of toys, books and, currently, stickers.

He first got into this during the World Cup in 2014 when I thought he was old enough to really enjoy collecting the stickers of all the players and teams that you find in the famous Panini sticker album that’s published before each competition. I remember collecting these when I was growing up; the excitement at opening each pack...

Alias Analysis in HELIX

21 December 2015

One of the most important parts of our HELIX compiler is the data dependence analysis we run on the compiler’s IR to determine which instructions are independent of each other. You can read more about HELIX in general in our original CGO 2012 paper (click through my publications page to get free access to the ACM version).

HELIX’s initial data dependence pass is split into two phases, and it’s the memory alias analysis stage that is most interesting. This has the job of identifying the locations in memory that are read and written by each instruction so that we can respect all data dependences within the loops we parallelise. Since alias analysis is not precise, we need to...

Boosting Performance By Limiting Vectorisation

27 October 2015

It sounds a bit counter-intuitive, but boosting application performance by limiting the amount of vectorisation carried out is essentially what my postdoc, Vasileios Porpodas, and I have done in our latest paper on automatic vectorisation. We call it TSLP, or Throttled SLP, because it limits the amount of scalar code that the standard SLP algorithm converts into vectors.

The actual paper is available here. Vasileios presented it at PACT last week and will be at the LLVM Developers’ Meeting this week, so I thought it might be interesting to expand on one of the examples we give at the end, showing the source code and how it is actually vectorised with SLP and TSLP. The kernel is compute-rhs, which is a...

Hello World!

1 October 2015

Today is the first day I’m officially employed as a University Lecturer, so it seems appropriate for my research blog to be born right now. It’s going to be a bit of an experiment (as you’d expect), but hopefully a place where I can describe the research my group is undertaking in more detail, in a different way to how it’s presented in our papers, and with interesting results that we don’t intend to publish in any conference or journal. My aim is to make the research more accessible and informal than our academic articles. Let’s see how I get on!

Cheers, Tim
