Saturday, March 10, 2007

A better music player (Part I)

As a college student studying engineering and music performance, and as a loonix geek, I've spent far too much time thinking about music players. Some of the first hacking projects I worked on in high school were tools to sensibly shuffle and play my music library, which I approached using primitive semantic grouping. Now I'm more interested in the GNOME desktop, user interface design, and integration as well. I'd like to combine some of these ideas to take music playback on the GNOME desktop beyond iTunes.

In this frighteningly long post, I'm going to outline my rationale for why I plan to do it.

The State of Music Management and Playback: Then and Now
First generation music players play audio files dutifully when our intrepid user opens one with a file manager. They show you the awesome metadata for each "song"¹ such as its artist, title, album, and so on. Lots of once-popular audio players (winamp, wmp, xmms, quicktime player, etc.) did this. It was cool. Then a point came when people started realizing that a track's metadata is more useful than its file name, because file names are typically an inconsistent jumble (and redundant with the metadata anyways). Searching through directories full of this sort of file is a pain unless you're obsessive enough (like, say, me or purple_cow) to go through and sort things using some sort of recursive renaming tool and then tweak by hand.

Then came Apple's iTunes. It became popular very quickly, and ushered in a second generation of music players. How? Well, sure, it integrated nicely with the iPod, and integrated that crazy internet shopping thing, but it also took the metadata feature a step further, and made those jumbled file names invisible to the user. The only contextual track information exposed to the user is metadata. The file names no longer matter. This turned out well, and a whole slew of GNOME applications have imitated this idea to some extent: rhythmbox is the most obvious one (since its goal seems to be to replicate iTunes as closely as possible) but to varying extents, banshee, quodlibet, listen, and muine do too.

Well, cool. Those projects have done a good job at that, but I feel this approach will be superseded soon, and that we might as well try to be ahead of that curve in GNOME. Why?
  • Beagle and Tracker already index music metadata. Why index them separately?
  • None of the existing GNOME music players integrates all that well into the desktop infrastructure (nautilus, menus, the panel..) yet.
  • Shuffling has always been horribly sloppy in every music player I've ever seen. This is not an exaggeration. We can do better.
¹ This term is perpetually inaccurate in music-playing software. I don't care how hip you're trying to be, or what your target denominator is. A much less annoying word choice is "track." A "song" is a specific term with a specific meaning which is not "the stuff in this sound file." A symphony is not a "song," a jazz tune is not necessarily a "song," so please don't write software that assumes every track in my library is a song. Thank you.

Bone to Pick #1: Metadata indexing
The second generation of music players was ahead of the desktop curve in terms of the metadata/tagging craze. Desktops are slowly evolving in that direction; in GNOME this is driven by the beagle and tracker indexing tools. If this trend continues, file names may no longer be relevant to a desktop user within a few years. Awesomely, these services also index music. Yay! This means we shouldn't need to do that separately in music players any more, right? Jamie McCracken has stated plans to start implementing this via a tracker backend in rhythmbox soon, but in the meantime consider this (very nonscientific) survey of system resources consumed by some GNOME music players for my ~5000 track library:
  • Memory consumption on startup:
    • banshee: 62M
    • listen: 78M (! more than firefox with 4 tabs open to fancy JS pages)
    • quodlibet: 46M
    • rhythmbox: 32M
  • Disk space consumption of library index:
    • banshee: 7M
    • rhythmbox: 3M
    • quodlibet: 2M
    • listen: 7M
Bone to Pick #2: Shuffling
This is horribly sloppy in every single player I have ever seen or used. Music players have supported this feature since they first appeared, so it's kind of impressive to me they still can't do this right. Even today after 10-odd years of a world of tagged MP3s, every player I have seen is still horribly sloppy at this. A proper shuffler should be able to semi-intelligently group tracks together, and insert an appropriate brief delay between tracks and groups. Pretty much like a DJ. Yet I've still only ever observed exactly two different kinds of grouping, and they are both rigid and simplistic.

The most common approach (in, e.g., iTunes, rhythmbox, banshee, listen, and quodlibet) is to select a random track out of the library's overall list of tracks. With this scheme, it is inevitable (in my library certainly) that tracks are chosen from somewhere in the middle of a symphony, that 30-second "Stop" track from The Wall, a spoken-word segue in a Kanye West album, or some other track that generally doesn't make sense out of context. This drives me crazy. Unless your music library consists mostly of Britney Spears-alikes or random singles (which is admittedly actually kind of common), this approach to shuffling will not be very satisfying, and requires frequent intervention by the user.

The second common approach has been to select and play a random album (as in muine and quodlibet). This bothers me less, except that it's still not what I always want. First, any tracks I have without an album tag will typically get ignored. That's not good. I want a shuffle feature to be able to reach my entire library. Secondly, if I try to listen to certain sets end-to-end (e.g., any Beck albums, or Bach or Shostakovich Preludes and Fugues) I'm likely to go slightly crazy and move on.

I'm not satisfied with either of these approaches. I want to be able to segment these into smaller groups (preludes paired with fugues), or even play them as separate tracks (as for the Beck albums). More generally, different genres ought to be grouped and treated differently by default, according to the traditions of that genre. Multi-movement classical pieces, modal jazz albums (hoo-ray Miles Davis!), and progrock typically require the context of the rest of the group in order to make sense. Folk music, most rock, big band jazz, and many ethnic musics are sometimes placed on an album for convenience rather than artistic cohesion, and users may be used to hearing them as singles on the radio, so there is often little reason for them to be grouped together. Clearly these defaults are rough stereotypes; it would still be necessary to provide a mechanism to tweak grouping individually. I don't know of any tools for GNOME that do this right now. It also impresses me that there aren't any.
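To make the grouping idea concrete, here is a rough Python sketch under some loud assumptions: the Track fields, the CONTEXT_GENRES set, and the function names are all made up for illustration and do not come from any existing player. Tracks in a "context" genre stay together with their album, in track order; everything else (including tracks with no album tag) becomes a group of one, so the shuffle can still reach the whole library.

```python
import random
from dataclasses import dataclass

# Hypothetical defaults: genres whose tracks need their album for context.
CONTEXT_GENRES = {"classical", "modal jazz", "progressive rock"}


@dataclass(frozen=True)
class Track:
    title: str
    album: str = ""   # empty string means "no album tag"
    genre: str = ""
    number: int = 0   # position within the album


def build_groups(tracks, context_genres=CONTEXT_GENRES):
    """Split a library into playback groups (each group is a list of tracks)."""
    albums = {}
    groups = []
    for track in tracks:
        if track.album and track.genre.lower() in context_genres:
            # Keep these together with the rest of their album.
            albums.setdefault(track.album, []).append(track)
        else:
            # Singles, untagged tracks, and "album of convenience" genres.
            groups.append([track])
    for members in albums.values():
        groups.append(sorted(members, key=lambda t: t.number))
    return groups


def shuffled_playlist(tracks, rng=random):
    """Shuffle whole groups, then flatten them into one play order."""
    groups = build_groups(tracks)
    rng.shuffle(groups)
    return [track for group in groups for track in group]
```

A per-album or per-track override table (for the preludes-and-fugues case, say) would sit on top of these defaults; it is left out here to keep the sketch short.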

The last shuffling issue I'd like to address is that of song transitions. This is another situation where traditions vary by musical genre and artist: typically in classical music you don't want to add any extra delay in between movements, because (especially for attacca endings) it will fubar the flow, but every player I've used needs a part of a second to load the next file. On the other hand, it's traditional in many popular idioms (and can sound rather nice) to let a finished song hang in the air for a second or two between tracks, or to crossfade (I understand this is possible in Amarok.. why shouldn't GNOME players be able to do this?).
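The transition rule could be as small as a lookup table. The sketch below is only an illustration: the genre names, the gap lengths, and the gap_seconds helper are assumptions, not measurements or anyone's existing API. It only decides how long to pause; actually achieving a zero-length gap would also require the player to preload the next file, which this does not model.

```python
# Hypothetical pause lengths in seconds, keyed by lowercase genre.
GAP_WITHIN_GROUP = {"classical": 0.0}      # movements should run together
GAP_BETWEEN_GROUPS = {"classical": 2.0, "rock": 1.5}
DEFAULT_GAP = 1.0


def gap_seconds(prev_genre, prev_group, next_group):
    """Return the pause to insert after a track finishes.

    prev_group and next_group identify the playback group (an album name,
    for instance); None means the track is a single with no group.
    """
    genre = prev_genre.lower()
    same_group = prev_group is not None and prev_group == next_group
    if same_group:
        return GAP_WITHIN_GROUP.get(genre, DEFAULT_GAP)
    return GAP_BETWEEN_GROUPS.get(genre, DEFAULT_GAP)
```

For example, gap_seconds("Classical", "Symphony 5", "Symphony 5") gives no pause between movements, while the same call with a different next group lets the final chord hang for a couple of seconds.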

Bone to Pick #3: Desktop Integration
And now the fun part. Music playback has become a staple task for desktop computers. Desktop instant messaging developers recognized the same fact about their arena, and started the telepathy project, yet I haven't seen anything like this for music playback.

We can easily avoid the monolithic (and frankly often sluggish) iTunes-alike music playback approach by adopting a decentralized dbus-based music infrastructure reminiscent of the telepathy design:
  • A playback application, perhaps small enough to show some information in a panel applet, and to pop up some controls when clicked - not unlike the calendar in the clock applet. Its jobs could be to
    • Prioritize playback requests from other components (or let PulseAudio do this?)
    • Handle *all* aspects of playback using gstreamer
    • Higher-priority playback requests preempt lower-priority playback requests by pausing, fading or muting to background
    • Lower-priority playback requests queued for playback after higher-priority requests
    • Not need to know anything about music libraries.
    • Pause when *anything* (e.g. PulseAudio preemption requests) happens
    • Mostly sit in the panel and respond to control and information requests.
  • A shuffling app in some window to configure and generate random playback (like a radio DJ) appropriately for the style of music.
    • Play ambient with a low priority; requests probably preempted in the player when specific files are requested by other components
  • A metadata search app - why not do this in nautilus using beagle/?...
    • Sends requests to play selected music
    • Could be inside, i.e., music:/// URI in nautilus?
  • A podcast/internet radio browsing app - could even be firefox+google reader..
  • A tagging/metadata management app?
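The request prioritization in the playback application might look roughly like the toy model below. It plays no audio and talks to neither dbus nor gstreamer; the PlaybackQueue class, its method names, the numeric priorities, and the example URIs are all hypothetical. It only shows the ordering rules: a higher-priority request preempts the current one, and lower-priority requests wait their turn.

```python
import heapq
import itertools


class PlaybackQueue:
    """Toy model of playback request prioritization (no audio involved)."""

    def __init__(self):
        self._heap = []                  # waiting requests
        self._order = itertools.count()  # tie-breaker: first come, first served
        self.current = None              # (-priority, seq, uri) or None

    def request(self, uri, priority):
        """Submit a request; a larger number means a higher priority.

        Returns the URI that should be playing afterwards.
        """
        entry = (-priority, next(self._order), uri)
        if self.current is None:
            self.current = entry
        elif entry[0] < self.current[0]:
            # Strictly higher priority: park the current request and preempt.
            heapq.heappush(self._heap, self.current)
            self.current = entry
        else:
            heapq.heappush(self._heap, entry)
        return self.current[2]

    def finished(self):
        """Call when the current request ends; returns the next URI or None."""
        self.current = heapq.heappop(self._heap) if self._heap else None
        return self.current[2] if self.current else None
```

So a shuffler could submit ambient tracks at priority 1, and a file the user explicitly picks in nautilus could arrive at priority 5 and take over; once it finishes, the parked ambient track comes back. A real player would also need to remember the playback position of a preempted track and decide between pausing, fading, or muting it.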
Conclusions
Mostly in this post I've focused on what I'd like to do. In Part 2, I'll discuss more about how I plan to do it! I'd love feedback from anyone (especially GNOME users and developers) who reads this, as I'd like to start into this as a Google SoC project.

2 comments:

Evan said...

I have given it some thought, and as promised I now have formed an opinion.

By and large, I like where you're going with this. The idea of having more integrated music playback in a desktop environment sounds good to me. I've long hated the tray icons of my music players. I usually want them to just play music and leave me alone; the only time I ever want to look at them is when I'm choosing what to listen to next. What makes me nervous about some of your suggestions is all the little window turds that could become involved in playing my music---your suggestion of a shuffle application comes to mind here. I'm also disinclined to further clutter my panel with something just for music playback, especially on my 12" laptop screen. That said, desktop integration with the obsoletion of file names sounds great to me.

I know we've discussed shuffling before, but in the interest of completeness I'll make a brief comment here. Right on. Shuffling should be done more intelligently than just "shuffle all my tracks" or "shuffle all my albums." Some tracks need the context of the album (and I would personally venture that that is as true of Beck as it is of a symphony) and some don't (e.g. a greatest hits collection).

Death to iTunes!

lurgy said...

I agree. I think I still have a lot of brainstorming to do for the interface bits. Suggestions welcome ^_^