Independent Films by the Numbers

The marketing of Independent Films

Language Structure of Festival Titles

I ran the collection of festival movie titles I have gathered through a part-of-speech (POS) tagger. My goal is to use the tagged titles to help refine some future performance analyses I plan on doing. The results are interesting in their own right.

First, a disclaimer needs to be made about POS tagging with automated software. It is not perfect, especially when tagging titles, which tend to lack rich contextual hints given their brevity. In fact, I had to trial a few POS taggers before I found one that worked to my liking. I think that nouns (NN) are overrepresented because, when in doubt, the tagger tags a word as a noun.

Also note that I had to relax the cues for Proper Nouns vs. Nouns given the standard structure of titles, which capitalize most words. As such, proper nouns are not reflected in this analysis and were treated as standard nouns.

The results are as follows for the top 25 title part of speech structures:

key: Format = [RANK 1…25] [POS Tag] [Count] [% of Sample]

tags: NN = Noun; NNS = Plural Noun; JJ = Adjective; DT = Determiner; POS = Possessive Ending; IN = Preposition; CC = Coordinating Conjunction; VBN = Verb Past Participle; VBG = Verb Present Participle; VBZ = Verb, Third Person Singular Present; CD = Number; RB = Adverb

1 NN 526 (14.66%)
2 NN NN 248 (6.91%)
3 JJ NN 184 (5.13%)
4 DT NN 161 (4.49%)
5 NNS 90 (2.51%)
6 JJ 72 (2.01%)
7 DT NN NN 69 (1.92%)
8 NN POS NN 53 (1.48%)
9 DT JJ NN 52 (1.45%)
10 DT NN IN NN 46 (1.28%)
11 NN NNS 46 (1.28%)
12 NN IN NN 44 (1.23%)
13 JJ NNS 40 (1.11%)
14 NN IN DT NN 34 (0.95%)
15 VBN 29 (0.81%)
16 NN NN NN 27 (0.75%)
17 DT NNS 24 (0.67%)
18 NN CD 23 (0.64%)
19 NN CC NN 22 (0.61%)
20 RB 21 (0.59%)
21 VBG NN 21 (0.59%)
22 NN VBG 21 (0.59%)
23 NN VBZ 19 (0.53%)
24 NNS IN NN 19 (0.53%)
25 CD NNS 18 (0.50%)
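A ranking like the one above can be produced by reducing each tagged title to its tag sequence and counting the sequences. The following is a minimal sketch, not the tool actually used: it assumes the titles have already been tagged with Penn Treebank-style tags, the sample titles and their tags are hand-assigned for illustration, and the names `FOLD`, `title_pattern`, and `rank_patterns` are hypothetical.

```python
from collections import Counter

# Hypothetical pre-tagged titles: each title is a list of (word, tag) pairs.
# The tags here are hand-assigned for illustration, not real tagger output.
tagged_titles = [
    [("Freeheld", "NNP")],
    [("Moving", "VBG"), ("Midway", "NNP")],
    [("Operation", "NNP"), ("Filmmaker", "NNP")],
    [("Taxi", "NN"), ("Driver", "NN")],
    [("The", "DT"), ("Deer", "NN"), ("Hunter", "NN")],
]

# Fold proper-noun tags into common-noun tags, since title capitalization
# makes the distinction unreliable.
FOLD = {"NNP": "NN", "NNPS": "NNS"}


def title_pattern(tagged_title):
    """Return the tag sequence of one title as a space-joined string."""
    return " ".join(FOLD.get(tag, tag) for _, tag in tagged_title)


def rank_patterns(titles, top=25):
    """Count each tag pattern and return (pattern, count, percent) rows."""
    counts = Counter(title_pattern(t) for t in titles)
    total = sum(counts.values())
    return [(p, n, round(100 * n / total, 2)) for p, n in counts.most_common(top)]


# Print rows in the same layout as the key: rank, pattern, count, percent.
for rank, (pattern, n, pct) in enumerate(rank_patterns(tagged_titles), start=1):
    print(f"{rank} {pattern} {n} ({pct}%)")
```

Folding the proper-noun tags before counting is what makes "Operation Filmmaker" and "Taxi Driver" land in the same NN NN bucket.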


On the Naming of Independent Films

My initial survey of the names of Independent Films showing at film festivals this year is complete. What I have discovered is that there is a skewed distribution of name lengths, with a mode (the most frequent value in the distribution) at 10 characters including spaces. Some films of course have much longer names… as many as 82 characters. See the histogram below…

Length of Names for Independent Films

Take this information with a grain of salt… it does not necessarily suggest that this is the best length for a name. It does tell us the norm for naming films. What remains to be seen is whether length affects the performance of these films on the festival circuit.
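For anyone who wants to repeat the length count on their own list of titles, here is a minimal sketch. The sample titles are a small hypothetical stand-in for the full festival list, and the function names are illustrative only.

```python
from collections import Counter

# Hypothetical sample; the real survey covered every festival title collected.
titles = [
    "Freeheld",
    "Taxi Driver",
    "Moving Midway",
    "Deer Hunter",
    "Operation Filmmaker",
]


def length_distribution(names):
    """Map each character length (spaces included) to how many names have it."""
    return Counter(len(name) for name in names)


def modal_length(names):
    """Return the most frequent length; ties resolve to the first one seen."""
    return length_distribution(names).most_common(1)[0][0]


# A crude text histogram: one '#' per title of that length.
for length, count in sorted(length_distribution(titles).items()):
    print(f"{length:3d} {'#' * count}")

print("mode:", modal_length(titles))
```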

My assumption is that it does, since this length matches the best practice of branding, which is to create easily read and remembered names. Ten characters forces the filmmaker to use one or two words, which will naturally work better for branding purposes than longer names.

The next steps in this analysis will require more data including what films won or lost at festivals (audience and jury awards). My hypothesis is that shorter names win more awards than longer ones. We will see.

Regardless, understanding what works and does not work in naming your film is critical for successfully marketing it, since the name is typically the first thing a prospective viewer will see or hear about your work. A good name will make all the difference in grabbing an audience. There is much more work for me to do on the topic.


Why do Runtimes vary between Film Festivals?

I have spent the last few days pushing forward on getting a bunch of new festivals into my database. What I have discovered is widespread variation in reported runtime for the same film playing in different festivals. Why?

It is unclear in the data I have, but here are my thoughts on possible causes…

  • When a film premieres at a festival it is not uncommon for the film’s editing not to be complete when information for the festival program is requested. The delivered film may vary by a number of minutes from what was estimated earlier by the filmmaker.
  • Different festivals round partial minutes differently from one another. For instance, if I have a 5.5 minute film, one festival might consider this a 5 minute film while another will see this as a 6 minute film.
  • A festival run will oftentimes expose a problem with the original edit, which will need to be corrected by removing or adding minutes of content for the next screening. There is nothing like a live audience to vet your filmmaking choices.
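The rounding point in the list above is easy to show in a few lines. This is a sketch under stated assumptions: the two rounding conventions and the one-minute tolerance are illustrative guesses, not rules any festival has published.

```python
import math


def round_down(minutes):
    """Truncate partial minutes (5.5 -> 5)."""
    return math.floor(minutes)


def round_half_up(minutes):
    """Round to the nearest whole minute, with halves going up (5.5 -> 6).

    Python's built-in round() sends halves to the nearest even number,
    so it is avoided here.
    """
    return math.floor(minutes + 0.5)


def could_be_same_runtime(a, b, tolerance=1):
    """True if two listed runtimes differ by no more than `tolerance` minutes."""
    return abs(a - b) <= tolerance


actual = 5.5
listed_a = round_down(actual)      # one festival's listing
listed_b = round_half_up(actual)   # another festival's listing
print(listed_a, listed_b, could_be_same_runtime(listed_a, listed_b))
```

A tolerance check like this only covers rounding; it will not catch a film that was re-edited by several minutes between screenings.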

This raises a bunch of issues for databasing festival films. First, you cannot trust runtimes as a means to normalize films between festivals… just because two listings do not have the same runtime does not mean they are not the same film. Second, in a database of films showing at festivals, which runtime should be used, or should you track all versions? No good answers… IMDb seems to have gotten around this by explicitly stating where the runtime was measured (x minutes @ the Toronto Film Festival).
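One way to track all versions is to store a runtime per festival rather than a single runtime per film. The record below is a hypothetical sketch of that idea; the class name, fields, title, and runtimes are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FilmRecord:
    """One film with every runtime reported for it, keyed by festival."""
    title: str
    runtimes: dict = field(default_factory=dict)  # festival name -> minutes

    def add_screening(self, festival, minutes):
        """Record the runtime a given festival listed for this film."""
        self.runtimes[festival] = minutes

    def runtime_range(self):
        """Return (shortest, longest) reported runtime in minutes."""
        values = self.runtimes.values()
        return min(values), max(values)

    def describe(self):
        """List runtimes in the 'x minutes @ festival' style."""
        return [f"{m} minutes @ {f}" for f, m in self.runtimes.items()]


film = FilmRecord("Example Film")    # hypothetical title
film.add_screening("Sundance", 92)   # hypothetical runtimes
film.add_screening("Toronto", 88)
print(film.describe(), film.runtime_range())
```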


Understanding Connections between Film Festivals – Analysis

What do the data and chart presented in the previous two postings actually mean? Here are my insights:

1) Sundance has the advantage of being early in the year for this model, so it can feed other festivals. Given my analysis of data is only from one year, it is possible Toronto’s importance is under-rated in this analysis.

2) Toronto appears to avoid films shown at Tribeca, as indicated by the non-existent relationship (0.000) between the two festivals. I don’t think this is an accident… both festivals have brands tied to premiering high-profile, celebrity-filled films in the tradition of Cannes. Sundance’s more independent film brand is less of a threat to both of these festivals, so it is less risky to have overlap in their programs. The timing of release also plays into things, with Tribeca coming just before the summer and Toronto at the start of the fall, which should enable them to premiere films intended for very different seasons from one another.

3) Sundance, Tribeca, and South by Southwest (SXSW) seem to exist as feeders of other festivals. My personal experience with screening a film at Sundance concurs with this, since doing well at Sundance meant countless festivals knocking on our door to screen our film rather than us approaching them.

4) The strong connection between the Woodstock and Los Angeles (LA) Film Festivals strikes me as two festivals seeking similar films that have done well at other festivals. These festivals exist at opposite ends of the country, so they really do not compete with one another.

5) Newport seems to be strongly influenced by SXSW and Sundance, while having a similar catchment of films to San Francisco.

6) Not surprisingly, LA and San Francisco (SF) tend not to follow one another too closely. Is this a difference in style or a choice? It is unclear. As a former resident of the Central Coast of California, I would guess it is style.

7) With more festivals loaded into the matrix, this analysis could become very useful when planning the festival lifecycle of a new film. You would be able to understand where the best places to premiere are and where next to expend your efforts trying to get screened. Every dollar you can save by not applying to festivals that will probably not screen your film is worth saving. But more data is needed, including cross-year data, to understand the life of a film from one year to the next.


Understanding Connections between Festivals – Raw Data

As seen in the previous post, relationships between festivals…

Festival Network Map

The raw data behind it…

Sundance

    SF 0.124
    Newport 0.120
    SXSW 0.102
    LA 0.081
    Woodstock 0.060
    Berlin 0.048
    Toronto 0.031
    Tribeca 0.022

Berlin Film Festival

    SF 0.061
    Sundance 0.048
    Tribeca 0.042
    Toronto 0.039
    Newport 0.038
    LA 0.031
    SXSW 0.031
    Woodstock 0.005

SXSW

    Newport 0.126
    Sundance 0.102
    SF 0.047
    Tribeca 0.047
    LA 0.045
    Woodstock 0.042
    Berlin 0.031
    Toronto 0.008

Tribeca

    Newport 0.118
    SF 0.069
    LA 0.063
    SXSW 0.047
    Woodstock 0.047
    Berlin 0.042
    Sundance 0.022
    Toronto 0.000

San Francisco

    Sundance 0.124
    Newport 0.099
    Tribeca 0.069
    Berlin 0.061
    LA 0.047
    SXSW 0.047
    Woodstock 0.022
    Toronto 0.013

Newport

    SXSW 0.126
    Sundance 0.120
    Tribeca 0.118
    SF 0.099
    LA 0.049
    Berlin 0.038
    Woodstock 0.020
    Toronto 0.006

Los Angeles

    Woodstock 0.100
    Sundance 0.081
    Tribeca 0.063
    Newport 0.049
    SF 0.047
    SXSW 0.045
    Berlin 0.031
    Toronto 0.007

Toronto

    Woodstock 0.046
    Berlin 0.039
    Sundance 0.031
    SF 0.013
    SXSW 0.008
    LA 0.007
    Newport 0.006
    Tribeca 0.000

Woodstock

    LA 0.100
    Sundance 0.060
    Tribeca 0.047
    Toronto 0.046
    SXSW 0.042
    SF 0.022
    Newport 0.020
    Berlin 0.005

Understanding Connections between Festivals – Part 2

Coming off my earlier simple analysis of festival relationships, I have gotten more sophisticated…

My trouble with the analysis at Woodstock was that it did not account for differences in sample sizes between festivals and therefore did not allow for easy comparison of Woodstock to other festivals. For instance, one can assume that a festival that came before Woodstock and showed 200 films would have a better chance of sharing films with Woodstock than another festival showing half as many films… all things being equal.

I decided to be more rigorous and cast my net wider to improve my approach to this problem. I turned to matrix math to resolve some of my issues of sample size using cosine similarity. For those of you who missed this strange mathematical turn in high school or college, matrix math is the math of grids. In our case, it is a grid of films by festivals with a cell value of 1 where a festival showed a given film and 0 in all other cells.

By plugging these cell values into some simple formulas in Excel, I am able to calculate a value from 0 to 1 representing the cosine of the angle created in n-dimensional space by plotting the film vectors of each pair of festivals. In layman’s terms, I can calculate a number between 0 and 1 that measures how similar two festivals are in terms of what films they showed and did not show. A value of 1 represents a perfect match, while a zero shows no relationship.
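The same calculation can be sketched outside of Excel. The example below builds the 0/1 film vectors described above and computes the cosine between two festivals. The festival and film names are placeholders, and the helper names are hypothetical; this is one way to do it, not a copy of the spreadsheet.

```python
import math

# Hypothetical programs: each festival maps to the set of films it showed.
programs = {
    "Festival A": {"Film 1", "Film 2", "Film 3", "Film 4"},
    "Festival B": {"Film 3", "Film 4", "Film 5", "Film 6"},
    "Festival C": {"Film 7"},
}


def to_vector(program, all_films):
    """Turn a festival's program into a 0/1 vector over every known film."""
    return [1 if film in program else 0 for film in all_films]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# One column per film, in a fixed order shared by every festival.
all_films = sorted(set().union(*programs.values()))
vectors = {name: to_vector(films, all_films) for name, films in programs.items()}

sim_ab = cosine_similarity(vectors["Festival A"], vectors["Festival B"])
sim_ac = cosine_similarity(vectors["Festival A"], vectors["Festival C"])
print(sim_ab, sim_ac)
```

For 0/1 vectors the formula reduces to the number of shared films divided by the square root of the product of the two program sizes, which is how a large festival gets discounted for having more chances to overlap.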

For instance, the results for Woodstock in descending order are as follows:

    LA    0.10
    Sundance    0.06
    Tribeca    0.05
    Toronto    0.05
    SXSW    0.04
    SF    0.02
    Newport    0.02
    Berlin    0.00

Once I had these results, I was able to generate a simple festival map showing which festival pairs had above-average cosine scores in the population (the average is 0.052). The red arrows indicate strong relationships (more than one standard deviation, 0.035, above the average) and the blue arrows indicate weaker but still above-average relationships. The arrows are temporal indicators of which festival came first. In the case of two festivals occurring at the same time, no arrow is used.
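The map rules can be written as a small classifier. This sketch assumes that "strong" means a score more than one standard deviation above the average, which is one reading of the description; the labels and function name are hypothetical. The scores are the three-decimal Woodstock values from the raw data post.

```python
MEAN = 0.052     # average cosine score reported in the post
STD_DEV = 0.035  # standard deviation reported in the post


def classify(score, mean=MEAN, std_dev=STD_DEV):
    """Label a cosine score as a strong (red), weak (blue), or unmapped link.

    Assumption: strong means more than one standard deviation above the mean.
    """
    if score > mean + std_dev:
        return "strong"
    if score > mean:
        return "weak"
    return "none"


# Woodstock's scores against each other festival, from the raw data post.
woodstock = {
    "LA": 0.100, "Sundance": 0.060, "Tribeca": 0.047, "Toronto": 0.046,
    "SXSW": 0.042, "SF": 0.022, "Newport": 0.020, "Berlin": 0.005,
}

links = {name: classify(score) for name, score in woodstock.items()}
print(links)
```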

Festival Network Map

Toronto is an orphan, while Berlin and Sundance clearly are feeders of other festivals. I will show more of the numbers behind this analysis later.


More on Festival Film Run times

My previous work on run-times at film festivals has been well-received. The insights have surprised many, especially those film students who have been instructed to make 30 minute pieces, which do not seem to fit programming schedules all that well.

It has been requested that I break my chart apart by Fictional and Non-Fictional films. So here it is…

Festival Film Run Times by Genre

As you can see, the two charts are very similar. Non-Fiction films tend to run shorter for features and have a wider dispersal for short films than Fiction films. Some of this difference in dispersal for short films is probably due to sampling effects.

Please note, not all films in my database were included, since I have not classed all records into fiction or non-fiction.


Understanding Connections between Festivals

I spent the weekend up in the Hudson Valley at the Woodstock Film Festival. The leaves were turning colors and the movies were good. I got a chance to see “Moving Midway” and “Operation Filmmaker.” I especially liked Nina Davenport’s “Operation Filmmaker,” a gutsy exploration of the filmmaking process.

While there, my thoughts soon turned to how films move between festivals. All festivals love premieres, but you would be hard pressed to program a festival of only premieres. There will always be plenty of great films that you want to show that premiered elsewhere… a fact that is even true for the biggest festivals.

Woodstock offers a great place to study the interconnections between films and festivals, especially since I have a database of all the films that have played Sundance, Berlin, South by Southwest (SXSW), Tribeca, San Francisco, Newport, LA, and Toronto this year. Additionally, Woodstock is a very nicely programmed festival.

Thirty-nine of the 113 films playing at Woodstock this year played at least one of the festivals mentioned above before coming to this festival. The counts for each are listed below in the order in which the festivals took place this year:

  • Sundance – 9
  • Berlin – 1
  • SXSW – 6
  • Tribeca – 8
  • San Francisco – 2
  • Newport – 2
  • LA – 15
  • Toronto – 9
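Counts like these come from simple set intersections between one festival's program and each earlier program. Here is a minimal sketch with placeholder film names; the variable and function names are hypothetical.

```python
# Hypothetical programs; real data would hold every film each festival showed.
earlier_festivals = {
    "Sundance": {"Film A", "Film B"},
    "LA": {"Film B", "Film C", "Film D"},
}
woodstock_program = {"Film B", "Film C", "Film E", "Film F"}


def shared_counts(program, others):
    """Count, per earlier festival, the films it shares with `program`."""
    return {name: len(program & films) for name, films in others.items()}


def played_elsewhere(program, others):
    """Return the films in `program` that played at least one earlier festival."""
    seen = set().union(*others.values())
    return program & seen


print(shared_counts(woodstock_program, earlier_festivals))
print(len(played_elsewhere(woodstock_program, earlier_festivals)))
```

The per-festival counts can add up to more than the "at least one" total because a film may have played several earlier festivals. That is why the counts above sum to 52 while only 39 distinct films are involved.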

It is clear that Woodstock shares much in common with the LA Film Festival and very little with Berlin, Newport, and San Francisco. What is hard to determine from this analysis is whether LA is where Woodstock did much of its prospecting for films or whether the two had similar selection criteria for films screened at the leading North American festivals (Sundance, Tribeca, SXSW, Toronto, etc…). My gut says it is the second case, since the Los Angeles Film Festival has much more overlap with Berlin than Woodstock does, which suggests that Woodstock is employing similar programming acquisition strategies to LA minus the attention to the European festival market.


Nostalgia at Film Festivals

I have not posted for a few days while I completed an analysis for a leading carbonated beverage company. I have to pay the bills.

What I have to offer today is a look at older films shown at film festivals. Most films shown at festivals tend to be new (this year or last); however many festivals wallow in nostalgia for the past by showing older films. It is either a homage to past great films or a way to expand their audience (i.e. fill seats and sell tickets).

As I began collecting data on festivals, I began to wonder what the present trend towards nostalgia is, based on which older films were showing. The following is a graphic representation of what I discovered…

Older Films at Film Festivals

The data behind this chart encompass the films shown at the Sundance, Tribeca, Toronto, Berlin, Los Angeles, and Woodstock Film Festivals. Woodstock was just added since we are showing Freeheld there this week and I want to get data on smaller festivals — not being a very nostalgic festival it added only one film to this analysis.

As you can tell, Berlin dominates nostalgic screenings and it really likes films from the 1920s, 1960s, and 1990s. The US festivals studied focused more on the 1960s with some 1980s and 1970s films. There were some older films shown from the 1920s and 1930s in the US, but these were overshadowed by later decades. The general insight is that 1960s films are hot right now, while 1930s to 1950s films are not. I love 1940s films, but it seems they have lost their audience at festivals right now.

One thing that surprised me was how the Tribeca Film Festival shied away from 1970s movies despite its close association with Robert De Niro, who really came into his own during this period with classics such as Taxi Driver and The Deer Hunter. The gritty flavor of this era of filmmaking also fits well in my mind with New York, but this was not reflected in the Tribeca lineup.
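A decade breakdown like this comes down to binning release years. The sketch below uses a handful of made-up (festival, year) pairs rather than the real screening data, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical (festival, release year) pairs for older films.
screenings = [
    ("Berlin", 1927),
    ("Berlin", 1964),
    ("Berlin", 1968),
    ("Tribeca", 1962),
    ("Sundance", 1985),
]


def decade(year):
    """Return the first year of the decade a release year falls in."""
    return year // 10 * 10


def films_per_decade(rows):
    """Count older films by release decade across all festivals."""
    return Counter(decade(year) for _, year in rows)


for start, count in sorted(films_per_decade(screenings).items()):
    print(f"{start}s: {count}")
```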


More Thoughts on Festival Film Lengths

My work on run times for festival films has focused up until now upon what is the best length for films to get into a single festival. This isn’t the only measure of success on the festival circuit potentially related to run times.

I am pretty convinced, based on anecdotal evidence, that it might be easier to get into a festival with a short 4-5 minute film than a longer short film, but that longer short films win more awards. My question is: is the length of shorts and features that win awards at festivals statistically different from that of ones that just get into the festival?

I am also interested in the question: is there an optimal length for being scheduled in multiple festivals? Is this different from the optimal length for getting into just one festival? I don’t know how this will play out.

I will work on these questions over the next week. This will provide a more refined understanding of the best length for festival films. I hope that we can find a set of rules of thumb for film run times that will position a filmmaker’s work to get the most and best exposure via festivals.
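One possible way to approach the "statistically different" question is a permutation test on the difference in mean run time between award winners and the rest of the program. This is only a sketch of one option, not the method I have settled on. The run times below are invented for illustration, and the function names, trial count, and seed are arbitrary choices.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, trials=2000, seed=0):
    """Estimate how often a random relabeling gives a mean gap at least as
    large as the observed one (two-sided permutation test)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Invented run times in minutes, for illustration only.
winners = [12, 14, 15, 18, 20]
others = [4, 5, 5, 6, 7]
print(permutation_p_value(winners, others))
```

A small value would suggest the gap in run times is unlikely to be a labeling accident, though with festival data the sample of winners will be small and the result should be read cautiously.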

