PRE2019 3 Group15: Difference between revisions

From Control Systems Technology Group

Revision as of 09:53, 30 March 2020

Group Members

Name               Study                      Student ID
Mats Erdkamp       Industrial Design          1342665
Sjoerd Leemrijse   Psychology & Technology    1009082
Daan Versteeg      Electrical Engineering     1325213
Yvonne Vullers     Electrical Engineering     1304577
Teun Wittenbols    Industrial Design          1300148

Problem Statement and Objectives

DJ-ing is a relatively new profession. It has only been around for less than a century but has become more and more widespread in the last few decades. This activity has for the most part been executed by human beings. Current technology in the music industry has become better and better at generating playlists, or 'recommended songs' as, for example, Spotify does. Can we integrate a form of this technology into the world of DJs and create a 'robot DJ'? A robot DJ would autonomously create playlists and mix songs, based on algorithms and real-life feedback in order to entertain an audience.

How can we develop an autonomous system (a robot DJ) that the user can easily deploy as a substitute for a human DJ?

Users

The identification of the primary and secondary users and their needs is based on extensive literature research on the interaction between the DJ and the audience in a party setting. The reader is referred to the references section for the full articles. This section is written based on (Gates, Subramanian & Gutwin, 2006), (Gates & Subramanian, 2006) and (Berkers & Michael, 2017).

Primary users

  • Dance industry: this is the overarching organization that will possess most of the robots.
  • Organizer of a music event: this is the user that will rent or buy the robot to play at their event.
  • Owner of a discotheque or club: the robot can be an artificial alternative for hiring a DJ every night.

Primary user needs

  • The DJ-robot is more valued than a human performer.
    • The DJ-robot system provides something extraordinary and special.
    • The system is better than a human DJ in gathering information regarding audience appreciation.
  • The system's user interface is easy to understand, no experts needed.

Secondary users

  • Attenders of a music event: these people enjoy the music and lighting show that the robot makes.
  • Human DJs: likely to "cooperate" with a DJ-robot to make their show more attractive.

Secondary user needs

  • The music set played is structured and progressive.
    • Track selection should fit the audience background.
      • The system selects appropriate tracks regarding genre.
      • The musical presentation should reflect the audience energy level.
    • There is a balance between playing familiar tracks and providing rare, new music.
      • The system selects popular tracks that are valued by the audience.
      • Similar tracks to what the audience is into are played.
    • The dancers are taken on a cohesive and dynamical music journey.


  • The audience desires control over the music being played, to a certain extent.
    • The audience wants to hear their favourite music.
    • The audience doesn't want a predictable set of music.
    • The DJ-robot takes the audience reaction into account in track selection.

The Effenaar

Since 1971 The Effenaar has developed into one of the biggest music venues of the Netherlands. It has one main hall made for 1300 people and a second hall with a capacity of 400 people. Besides their music halls and restaurant, a part of their group is working on their so-called Effenaar Smart Venue. Their goal is to improve the music experience of their guests, and the artists, during their shows. In order to do this, the Smart Venue focuses on existing technology. They ask themselves: what is available? And, what fits in certain shows?

For example, at a concert of Ruben Hein, they had 12 people wearing sensors measuring brainwaves, heart rates and sweat intensity on their fingertips, as seen in this video. Afterwards, they were able to see which music tracks triggered their enthusiasm and which tracks didn’t. The difference between our project and theirs is that we want to receive the user feedback during music playback rather than afterwards. However, retrieving information this way could still be valuable, because it shows how the bodies of real people react to certain music tracks.

Besides only focusing on music, they had several other projects. During one show they had 4 control panels set up in the hall. With these panels the guests could change the visuals of the show. Moreover, they had a project where they would make a 3D scan of guests to be used by the artist as visuals or holograms. In another experiment, they introduced an app which would raise the awareness of the importance of wearing earplugs. They used team spirit as a tool to drive this behavior, since they see sense of community as a strong value. They had a project running on feedforward based on a user database, but this was shut down due to it not being financially profitable.

The Effenaar can still benefit from an improvement of their crowd control. For example, their main hall has a bar right next to the entrance, which sometimes causes blockades. Here, technology could play a role. The crowd could be directed by the use of lighting or what songs are being played.

Our robot could help the Effenaar with the music that is played before and after the shows. In their second hall, the lights are already automatically adapted to the music that is being played. However, the music before and after shows is controlled by an employee of the Effenaar. Most of the time this results in just playing an ordinary Spotify playlist. It would be more convenient if this could be automated, with a music set that fits the show. They don’t see much potential in replacing the artist of a show by a robot DJ. Rather, they expect a collaboration between a DJ and the technology. As said, they emphasize the social aspect of a night out, to create a memorable moment.

Approach, Milestones, and Deliverables

Approach

The goal of the project is to create a robot that functions as a DJ and provides entertainment to a crowd. In order to reach the goal, first a literature study will be executed to find out the current state of the art regarding the problem. After enough information has been collected, an objective will be defined.

Then, the USE and technical aspects of the problem will be researched. The technical aspect-research will be implemented in a design for the robot. Based on this design a prototype will be built and programmed that is able to meet the requirements of the goal.

Milestones

In order to complete the project and meet the objectives, milestones have been determined. These milestones include:

  • A clear problem and goal have been determined
  • The literature research is finished, this includes research about
    • Users (attenders of music events, DJs, club owners)
    • The state of the art in the music (AI) industry
  • The research on how to create the DJ-software is finished
    • Ways to obtain feedback from a crowd
    • Spotify API
  • The DJ-software is created
    • Depending on what method of feedback is chosen, a sensor is also built
  • A test is executed in which the environment the software will be used in is simulated
  • The wiki is finished and contains all information about the project


Other milestones, which probably are not attainable in the scope of 8 weeks, are

  • A test in a larger environment is executed (bar, festival)
  • A full scale robot is constructed to improve the crowd’s experience
  • A light show is added in order to improve the crowd’s experience

Deliverables

The deliverables for this project are:

  • The DJ-software, which is able to use feed-forward and incorporate user feedback in order to create the most entertaining DJ-set
  • Depending on what type of user feedback is chosen, a prototype/sensor also needs to be delivered
  • The wiki-page containing all information on the project
  • The final presentation in week 8

State of the Art

The dance industry is a booming business that is very receptive to technological innovation. A lot of research has already been conducted on the interaction between DJs and the audience, and also on automating certain cognitively demanding tasks of the DJ. Therefore, it is necessary to give a clear description of the current technologies available in this domain. In this section, the state of the art on the topics of interest when designing a DJ-robot is described by means of recent literature.

Defining and influencing characteristics of the music

When the system receives feedback from the audience it is necessary that it is also able to do something useful with that feedback and convert it into changes in the provided musical arrangement. The most important aspects of this arrangement are chaos, energy, tempo, danceability and valence.

Chaos is the opposite of order. Dance tracks with order can be assumed as having a repetitive rhythmic structure, contain only a few low-pitched vocals and display a pattern in arpeggiated lines and melodies. Chaos can be created by undermining these rules. Examples are playing random notes, changing the rhythmic elements at random time-points, or increasing the number of voices present and altering their pitch. Such procedures create tension in the audience. This is necessary because without differential tension, there is no sense of progression (Feldmeier, 2003).

The factors energy and tempo are inherently linked to each other. When the tempo increases, so does the perceived energy level of the track. In general, the music's energy level can be intensified by introducing high-pitched, complex vocals and a strong occurrence of the beat. Related to that is the activity of the public. An increase in activity of the audience can signal that they are enjoying the current music, or that they desire to move on to the next energy level (Feldmeier, 2003). This is because, in general, people enjoy tempo activation, in which they dance to music that leads their current pace (Feldmeier & Paradiso, 2004).

Danceability relates to how easy it is to dance to the music. Spotify describes this feature as "to what extent can people dance on it based on tempo, rhythm stability, beat strength, and overall regularity". Valence relates to the positivity (or negativity) of the song, that is, whether it conveys positive or negative emotions.

Algorithms for track selection

One of the most important tasks of a DJ-robot - if not the most important - is track selection. In (Cliff, 2006) a system is described that takes as input a set of selected tracks and a qualitative tempo trajectory (QTT). A QTT is a graph of the desired tempo of the music during the set with time on the x-axis and tempo on the y-axis. Based on these two inputs the presented algorithm basically finds the sequence of tracks that fit the QTT best and makes sure that the tracks construct a cohesive set of music. The following order is taken in this process: track mapping, trajectory specification, sequencing, overlapping, time-stretching and beat-mapping, crossfading.
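To make the sequencing idea concrete, the sketch below picks, for each point on a tempo trajectory, the unused track whose tempo is closest. This greedy matching is a simplification assumed here for illustration; it is not the algorithm from (Cliff, 2006), and the track names and tempos are made up.

```python
def sequence_tracks(tracks, qtt):
    """Greedily assign one unused track to each target tempo in the QTT.

    tracks: dict mapping track name -> tempo in BPM (hypothetical data)
    qtt:    list of target tempos in BPM, one per slot in the set
    """
    remaining = dict(tracks)
    playlist = []
    for target in qtt:
        if not remaining:
            break  # fewer tracks than slots: stop early
        # Pick the unused track whose tempo is closest to the target tempo.
        best = min(remaining, key=lambda name: abs(remaining[name] - target))
        playlist.append(best)
        del remaining[best]
    return playlist


# Hypothetical track pool and a rising tempo trajectory.
tracks = {"A": 120.0, "B": 128.0, "C": 124.0, "D": 135.0}
qtt = [120, 125, 130, 136]
playlist = sequence_tracks(tracks, qtt)
print(playlist)
```

A greedy choice is easy to follow but can paint itself into a corner on longer sets; a global search over orderings would fit the trajectory more closely.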

In the same article, genetic algorithms are used to determine the fitness of each song to a certain situation. The method sketches how to encode a multi-track specification of a song as the genes of the individuals in a genetic algorithm (Cliff, 2006).

Receiving feedback from the audience via technology

Audiences attending musical events generally like the idea of being able to influence the performance (Hödl et al., 2017). What matters is the way of interacting with the performers: what works well, and what do users prefer?

Hödl et al. (2017) describe multiple ways of interacting: mobile devices (see McAllister et al., 2004), smartphones, and other sensory mechanisms such as the light sticks discussed by Freeman (2005).

The system presented by Cliff (2006) already proposes some technologies that enable the crowd to give feedback to a DJ-robot. Cliff discusses under-floor pressure sensors that can sense how the audience is divided over the party area, as well as video surveillance that can read the crowd's activity and valence towards the music. Based on this information the system determines whether the assemblage of the music should be changed or not. In principle, the system tries to stick with the provided QTT. However, it reacts dynamically to the crowd's feedback and may deviate from the QTT if that is what the public wants.

Another option for crowd feedback discussed by Cliff is a portable technology, more in the spirit of a wristband. One option is a wristband that is more quantitative in nature and transmits location, dancing movements, body temperature, perspiration rates and heart rate to the system. Another option is a much simpler and therefore cheaper solution. A simple wristband with two buttons on it; one "positive" button and one "negative" button. In that sense, the public lets the system know whether they like the current music or not, and the system can react upon it.
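As a small illustration of how the two-button wristband could be turned into a single signal, the sketch below maps press counts to a score between -1 and 1. The function name and the scoring rule are assumptions made for this example; they are not taken from Cliff's description.

```python
def approval_score(positive_presses, negative_presses):
    """Return a score in [-1, 1]: 1 means only positive presses, -1 only negative.

    The scoring rule is an illustrative assumption, not part of the cited system.
    """
    total = positive_presses + negative_presses
    if total == 0:
        return 0.0  # no feedback received: treat as neutral
    return (positive_presses - negative_presses) / total


print(approval_score(30, 10))
```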

Summary of Related Research

This patent describes a system, something like a personal computer or an MP3 player, which incorporates user feedback in its music selection. The player has access to a database and based on user preferences it chooses music to play. When playing, the user can rate the music. This rating is taken into account when choosing the next song. (Atherton, Becker, McLean, Merkin & Rhoades, 2008)


This article describes how rap-battles incorporate user feedback. By using a cheering meter, the magnitude of enjoyment of the audience can be determined. This cheering meter was made by using Java's Sound API. (Barkhuus & Jorgensen, 2008)


Describes a system that transcribes drums in a song. Could be used as input for the DJ-robot (light controls for example). (Choi & Cho, 2019)


This paper is meant for beginners in the field of deep learning for MIR (Music Information Retrieval). This is a very useful technique in our project to let the robot gain musical knowledge and insight in order to play an enjoyable set of music. (Choi, Fazekas, Cho & Sandler, 2017)


This article describes different ways to automatically detect patterns in music from which the genre of the music can be determined. By finding the genre of the music that is played, it becomes easier to know whether the music will fit the previously played music. (De Léon & Inesta, 2007)


This describes "Glimmer", an audience interaction tool consisting of light sticks which influence live performers. (Freeman, 2005)


Describes the creation of a data set to be used by artificial intelligence systems in an effort to learn instrument recognition. (Humphrey, Durand & McFee, 2018)


This describes methods to learn features of music by using deep belief networks. It uses the extraction of low-level acoustic features for music information retrieval (MIR). It can then find out, for example, which genre a musical piece belongs to. The goal of the article is to find a system that can do this automatically. (Hamel & Eck, 2010)


This article communicates the results of a survey among musicians and attenders of musical concerts. The questions were about audience interaction. "... most spectators tend to agree more on influencing elements of sound (e.g. volume) or dramaturgy (e.g. song selection) in a live concert. Most musicians tend to agree on letting the audience participate in (e.g. lights) or dramaturgy as well, but strongly disagree on an influence of sound." (Hödl, Fitzpatrick, Kayali & Holland, 2017)


This article explains the workings of the musical robot Shimon. Shimon is a robot that plays the marimba and chooses what to play based on an analysis of musical input (beat, pitch, etc.). The creating of pieces is not necessarily relevant for our problem, however choosing the next piece of music is of importance. Also, Shimon has a social-interactive component, by which it can play together with humans. (Hoffman & Weinberg, 2010)


This article introduces Humdrum, which is software with a variety of applications in music. One can also look at humdrum.org. Humdrum is a set of command-line tools that facilitates musical analysis. It is often used in, for example, Python or C++ scripts to build programs with applications in music. Therefore, this program might be of interest to our project. (Huron, 2002)


This article focuses on next-track recommendation. While most systems base this recommendation only on the previously listened songs, this paper takes a multi-dimensional (for example long-term user preferences) approach in order to make a better recommendation for the next track to be played. (Jannach, Kamehkhosh & Lerche, 2017)


In this interview with a developer of the robot DJ system POTPAL, some interesting possibilities for a robot system are mentioned. For example, the use of existing top 40 lists, 'beat matching' and 'key matching' techniques, monitoring of the crowd to improve the music choice and to influence people's beverage consumption and more. Also, a humanoid robot is mentioned which would simulate a human DJ. (Johnson, n.a.)


In this paper a music scene analysis system is developed that can recognize rhythm, chords and source-separated musical notes from incoming music using a Bayesian probability network. Even though 1995 is not particularly state-of-the-art, these kinds of technology could be used in our robot to work with music. (Kashino, Nakadai, Kinoshita, & Tanaka, 1995)


This paper discusses audience interaction by use of hand-held technological devices. (McAllister, Alcorn & Strain, 2004)


This article discusses the method by which Spotify generates popular personalized playlists. The method consists of comparing your playlists with other people's playlists as well as creating a 'personal taste profile'. These kinds of things can be used by our robot DJ by, for example, creating a playlist based on what kind of music people listen to the most collectively. It would be interesting to see if connecting people's Spotify accounts to the DJ would increase performance. (Pasick, 2015)


This paper takes a mathematical approach in recommending new songs to a person, based on similarity with the previously listened and rated songs. These kinds of algorithms are very common in music systems like Spotify and of great use in a DJ-robot. The DJ-robot has to know which songs fit its current set and it therefore needs these algorithms for track selection. (Pérez-Marcos & Batista, 2017)


This paper describes the difficulty of matching two musical pieces because of the complexity of rhythm patterns. Then a procedure is determined for minimizing the error in the matching of the rhythm. This article is not very recent, but it is very relevant to our problem. (Shmulevich & Povel, 1998)


In this article, the author states that the main melody in a piece of music is a significant feature for music style analysis. It proposes an algorithm that can be used to extract the melody from a piece and the post-processing that is needed to extract the music style. (Wen, Chen, Xu, Zhang & Wu, 2019)


This research presents a robot that is able to move according to the beat of the music and is also able to predict the beats in real time. The results show that the robot can adjust its steps in time with the beat times as the tempo changes. (Yoshii, Nakadai, Torii, Hasegawa, Tsujino, Komatani, Ogata & Okuno, 2007)


This paper describes Open Symphony, a web application that enables audience members to influence musical performances. They can indicate a preference for different elements of the musical composition in order to influence the performers. Users were generally satisfied and interested in this way of enjoying the musical performance and indicated a higher degree of engagement. (Zhang, Wu & Barthet, in press)

A first model

Based on the state of the art and the user needs a first model of our automated DJ system is made. We chose to depict it in a block diagram with separate blocks for the feedforward, the feedback and the algorithm itself.

First model of the automated DJ system

Feedforward

The feedforward part of our system is completely based on the user input. The user has a lot of knowledge about the desired DJ set to be played beforehand. This information is fed to the system to control it. Because this feedforward block is based on user input, it has to answer to the user needs. Below, a scheme is presented to show how the user needs relate to the feedforward parameters.

Feedforward parameters in relation to the user needs

The first part of the feedforward is the desired QTT of the set. What a QTT is, is described in the state of the art section. The ability for the user to input a QTT answers to the primary user need of an easy user interface. Providing a QTT to the system also makes it easy for the system to play a structured and progressive set of music, which is a secondary user need. This is in line with the secondary user need that dancers are taken on a cohesive and dynamical music journey. This structure in the music will also keep the audience engaged.

To fulfill the desired QTT certain tracks are delivered to the system to pick from as feedforward. These tracks are delivered via an enormous database with all kinds of music in it, such that the system has enough options to pick from in order to form the best set. In that sense, the system can pick from the audience favourite tracks that are popular and valued by the audience, which are some of our secondary user needs. Since people come to certain music events with certain expectations, the tracks to pick from should be appropriate regarding genre. That way, the audience background is taken into account and it keeps them engaged. This is also an opportunity to limit the algorithm's options for track selection, making the system more stable. Because the database the system can choose from is very large and diverse, the user need of wanting to hear rare, new music is answered. Also, this contributes to an unpredictable set of music that is not boring.

MATS/SJOERD, COULD ONE OF YOU EVENTUALLY WRITE SOMETHING ABOUT THE MOST IMPORTANT SPOTIFY FEATURES THAT WE ARE GOING TO USE AS FF?

In order to satisfy the audience, we have to feed background information about the audience members forward to the system. In that way the system knows what the current audience is into. This could include their preferences regarding audio features, based on their Spotify profile, and their favourite or most hated tracks. This answers the primary user needs of making the system something extraordinary and more valued than a human DJ, because a human DJ can never know this information about every audience member. It also makes the system better than a human DJ at gathering information regarding audience appreciation, which is also a primary user need. It further answers the secondary user needs of keeping a balance between familiar and new music (because the system knows which tracks are familiar) and of audience members wanting to express themselves, be it by providing information to a computer. It also respects the DJ's desire to play tracks similar to what the audience is into and the desire to create a collective experience by means of a music set. This collective experience arises because every audience member contributes a part to the feedforward of the system.

Feedback

The feedback of the system consists of sensor output. This is the part where the audience takes more control. The feedback sensor system detects how many persons are present on the dance floor, relative to the rest of the event area. This can be done by means of pressure sensors in the floor. This information cues whether the current music is appreciated or not. Another cue for appreciation of the music can be generated via active feedback of the crowd. For example, valence can be assessed by the public by means of technologies described in the state of the art section. Because audience feedback is not commonly used at music events, the primary user need of providing something special and extraordinary is answered. The ability for the crowd to give feedback answers the primary user need that the system should be better at gathering information regarding audience appreciation than a human DJ. This ability also lets the public as a whole have influence on the music, which creates a collective experience. Additionally, it answers the secondary user need of having control over the music being played. In that sense, the track selection procedure takes the audience reaction into account such that it comes up with tracks that are valued by the audience members. This also lets the public control whether the music is predictable or not, which should keep them engaged.

The other part of sensory feedback is the audience energy level. This energy level may include the activity or movement of the audience members. This can be measured passively by means of a wristband with different options for sensing activity or energy by means of heart rate, accelerometer data, sweat response or other options described in the state of the art section. The incorporation of audience energy level feedback answers to the primary user need of gathering information about the audience appreciation in a better way than a human DJ can. It also answers to the secondary user need of making the musical presentation reflect the audience energy level.

Below, a scheme is presented with all the user needs that call for feedback sensors.

User needs in relation to feedback sensors

The algorithm

Based on the feedforward and feedback of the crowd, the tracks to be played and their sequence are selected as described here.

The next step in the algorithm is overlapping the tracks in the right way. Properly working algorithms that handle this task already exist, for example the algorithm described in (Cliff, 2006). We will describe the working principles of that algorithm in this section. The overlap section is meant to move seamlessly from one track to the other. In the described technology the time set for overlap is proportional to the specified duration of the set and the number of tracks, making it a static time interval. Alterations to the duration of this interval are made when the tempo maps of the overlapping tracks produce a beat breakdown or when the overlap interval would cause the set duration to be exceeded.

If the system wants to play a next track in a smooth transition but there is a difference between the tempo (BPM) of the current track and the next track, time-stretching and beat-mapping need to be applied. Time-stretching will slow down or bring up the tempo such that the tempo of the current and next track are nearly identical in order to produce a smooth transition to the next track. Technically speaking time-stretching is a (de)compression of time or changing the playback speed of the samples and applying proper interpolation in order to maintain sound quality. Once the tempos match, the beats of the two songs need to be aligned in order to acquire zero phase difference to avoid beat breakdown.
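The two quantities involved can be sketched as follows: the playback-speed factor that brings the next track to the current tempo, and the time shift that brings the beats of both tracks into phase. This is a minimal sketch under the assumption of constant tempos; the function names are hypothetical and the actual resampling and interpolation of audio are not shown.

```python
def stretch_ratio(current_bpm, next_bpm):
    """Playback-speed factor for the next track so that its tempo matches the current track."""
    return current_bpm / next_bpm


def beat_offset_seconds(current_beat_time, next_beat_time, bpm):
    """Smallest time shift (seconds) of the next track that aligns its beats with the current track.

    Both tracks are assumed to already run at the same constant tempo `bpm`.
    A positive result means the next track must be delayed.
    """
    period = 60.0 / bpm  # duration of one beat in seconds
    diff = (current_beat_time - next_beat_time) % period
    if diff > period / 2:
        diff -= period  # shifting backwards is the shorter correction
    return diff


# Hypothetical example: current track at 126 BPM, next track at 120 BPM.
ratio = stretch_ratio(126.0, 120.0)             # next track must play 5% faster
offset = beat_offset_seconds(10.2, 10.0, 120.0)
print(ratio, offset)
```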

The last step is proper cross-fading. Although ramping down the volume of the current track while ramping up the volume of the next track is often sufficient for a good cross-fade, the algorithm described uses more sophisticated techniques to achieve proper cross-fading. The algorithm analyses the audio frequency-time spectrograms of the two tracks to be cross-faded. This can be used to selectively suppress certain frequency components in the tracks such that current melodies seem to disappear and the next melody becomes more prominent. It can also filter out components of tracks that clash with each other, allowing for a smooth cross-fade (Cliff, 2006).
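For reference, the simple volume-ramp cross-fade mentioned above can be sketched as below. It uses an equal-power (sine/cosine) gain curve, which is a common choice assumed here for illustration; it is not the spectrogram-based technique of (Cliff, 2006), and the sample lists are made up.

```python
import math


def crossfade_gains(progress):
    """Equal-power gains (fade_out, fade_in) for a progress value between 0 and 1."""
    progress = min(max(progress, 0.0), 1.0)  # clamp to the valid range
    fade_out = math.cos(progress * math.pi / 2)
    fade_in = math.sin(progress * math.pi / 2)
    return fade_out, fade_in


def crossfade(current, upcoming):
    """Mix two equally long runs of samples, fading `current` out and `upcoming` in."""
    n = min(len(current), len(upcoming))
    mixed = []
    for i in range(n):
        progress = i / (n - 1) if n > 1 else 1.0
        g_out, g_in = crossfade_gains(progress)
        mixed.append(g_out * current[i] + g_in * upcoming[i])
    return mixed


# Hypothetical constant signals: the mix starts at the current track's level and ends at the next one's.
mixed = crossfade([1.0, 1.0, 1.0], [0.5, 0.5, 0.5])
print(mixed)
```

With the equal-power curve the squared gains always sum to one, which avoids the dip in perceived loudness that a linear ramp can produce halfway through the fade.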

User interface

On the user interface (UI), the user can define the input values for different parameters that the pre-filter will use to filter the data-set.

First version UI

Figure 1: First UI containing sliders and the input values


On the UI, the input values can be defined by using sliders. The value that the slider is currently on is also displayed on the UI. Then, by clicking on the submit button, the input values are confirmed and stored in a JavaScript Object Notation (JSON) file. The UI displayed in Fig. 1 is a very primitive version without any styling; styling can be added with CSS. In Fig. 2 it can be seen that the correct values were put into the JSON file.


JSON file containing parameters

Figure 2: The JSON file containing the correct values for the parameters
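The flow from submitted slider values to a filtered data-set could look roughly like the sketch below. The parameter names, the min/max layout of the JSON and the example tracks are all hypothetical; the actual file produced by the UI may be structured differently.

```python
import json

# Hypothetical slider values as they might be stored after pressing submit.
params = {
    "energy": {"min": 0.6, "max": 1.0},
    "valence": {"min": 0.3, "max": 0.9},
}
params_json = json.dumps(params, indent=2)


def pre_filter(tracks, params):
    """Keep only the tracks whose features all lie inside the requested ranges."""
    return [
        track for track in tracks
        if all(p["min"] <= track[name] <= p["max"] for name, p in params.items())
    ]


# Made-up tracks with Spotify-style features in the range 0..1.
tracks = [
    {"name": "track_a", "energy": 0.8, "valence": 0.5},
    {"name": "track_b", "energy": 0.4, "valence": 0.5},
    {"name": "track_c", "energy": 0.9, "valence": 0.95},
]
selected = pre_filter(tracks, json.loads(params_json))
print([track["name"] for track in selected])
```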

Pre-filter

Excitement prediction with multiple regression

In order to come up with a useable product we need to narrow down the scope of this project. We decided to focus on engineering a proper pre-filter and feedforward system for track selection, based on the Spotify audio features. We mainly focus on the features "energy" and "valence". Even after extensive research, no formal definition or equation was found for the Spotify feature "energy". Spotify itself describes it as a perceptual measure of intensity and activity based on dynamic range, perceived loudness, timbre, onset rate, and general entropy. It is a floating point number with a range between 0 and 1. The same holds for the "valence" feature; no equation can be found but Spotify describes it as a representation of the positivity or negativity of a track. It is a floating point number with a range of 0 to 1 where values close to 1 represent positive songs, whereas values close to 0 represent sad songs.

Continuing, we rated 152 songs on "excitement" and then performed a multiple regression analysis with the Spotify features "energy" and "valence" to come up with a prediction equation for "excitement" based on these features. We rated "excitement" such that it represents how enthusiastic the song is, that is, how excited one gets by listening to it. It is a floating point number ranging from 0 to 1. Values close to 0 represent songs that will not get people enthusiastic, whereas values close to 1 represent songs that are very exciting and foster enthusiasm among listeners. Please note that for now the feature "excitement" is completely made up by ourselves and inherently subjective in nature. However, we deemed "excitement", as it is defined now, a good parameter for selecting tracks that make the audience happy. We picked songs from three different genres of dance music: Techno, Hardstyle and Disco. We wanted to stick to dance music, but to diversify our research we considered three distinct genres. The results of the regression analyses are presented in the following sections.
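For readers unfamiliar with the technique, the sketch below shows how an ordinary least-squares fit of excitement = b0 + b1*energy + b2*valence can be computed from the normal equations. It is an illustration only: the data are made up rather than our 152 ratings, the function name is hypothetical, and a statistics package would normally be used because it also reports standard errors and p-values.

```python
def fit_two_predictor_regression(energy, valence, excitement):
    """Least-squares fit of excitement = b0 + b1*energy + b2*valence.

    Solves the normal equations (X^T X) b = X^T y with Gauss-Jordan elimination.
    Fails with a division by zero if the predictors are perfectly collinear.
    """
    rows = [(1.0, e, v) for e, v in zip(energy, valence)]
    # Augmented 3x4 matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, excitement))]
        for i in range(3)
    ]
    for col in range(3):
        # Partial pivoting: move the row with the largest entry into place.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


# Made-up data generated from known coefficients, so the fit should recover them.
energy = [0.2, 0.5, 0.9, 0.7, 0.4]
valence = [0.3, 0.8, 0.1, 0.6, 0.9]
excitement = [0.4 + 0.2 * e + 0.1 * v for e, v in zip(energy, valence)]
b0, b1, b2 = fit_two_predictor_regression(energy, valence, excitement)
print(b0, b1, b2)
```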

Multiple regression per genre

The first step in our analysis was to do a multiple regression for every distinct genre in our database - being either Techno, Hardstyle or Disco - to see whether the equation for "excitement" differs between genres and whether it gives any significant results to start with.

Multiple regression techno

The multiple regression model for techno is the only genre-specific model that turned out significant, R² = 0.1, F(2, 60) = 3.39, p = 0.04. Here, "excitement" was based on the Spotify features "energy" and "valence". It means that for techno the appropriate equation for excitement is the following: ex = 0.39 + 0.183*en + 0.208*v, where "ex" is excitement, "en" is energy and "v" is valence. Note that within this model only the coefficient of valence is individually significant (p = 0.039); the coefficient of energy is not (p = 0.121). The exact results are presented in the table.

| excitement | Coefficient | Standard Error | t | p |
| energy | 0.183 | 0.116 | 1.57 | 0.121 |
| valence | 0.208 | 0.096 | 2.11 | 0.039 |
| constant | 0.390 | 0.100 | 3.91 | 0.000 |

Multiple regression hardstyle

The model for hardstyle turned out non-significant, R² = 0.05, F(2, 37) = 0.91, p = 0.41. It means that with this sample of hardstyle songs and their excitement ratings, linear regression found no reliable equation for "excitement". The exact results are presented in the table.

| excitement | Coefficient | Standard Error | t | p |
| energy | -0.171 | 0.154 | -1.11 | 0.276 |
| valence | -0.057 | 0.063 | -0.91 | 0.371 |
| constant | 0.800 | 0.142 | 5.61 | 0.000 |

Multiple regression disco

The model for disco turned out non-significant, R² = 0.1, F(2, 46) = 2.55, p = 0.09. It means that with this sample of disco songs and their excitement ratings, linear regression found no reliable equation for "excitement". The exact results are presented in the table.

| excitement | Coefficient | Standard Error | t | p |
| energy | 0.079 | 0.109 | 0.72 | 0.474 |
| valence | -0.142 | 0.077 | -1.86 | 0.070 |
| constant | 0.693 | 0.118 | 5.86 | 0.000 |

Multiple regression across genres

In the next step, we took all genres together and performed the same multiple regression analysis on that dataset. This regression turned out significant, R² = 0.06, F(2, 149) = 4.99, p = 0.008. It means that there is a linear equation to predict "excitement" of a track based on the audio features "energy" and "valence" if you consider the three dance genres together. This equation is as follows: ex = 0.45 + 0.16*en + 0.07*v. When we computed the predicted value from this equation for every track and took the absolute difference between this predicted value and the rated value for "excitement", the mean absolute difference was 0.07 with a standard deviation of 0.06. All exact results can be found in the table below.

| excitement | Coefficient | Standard Error | t | p |
| energy | 0.160 | 0.070 | 2.28 | 0.024 |
| valence | 0.070 | 0.024 | 2.93 | 0.004 |
| constant | 0.450 | 0.063 | 7.13 | 0.000 |
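As a small worked example of how the cross-genre equation can be applied and how the mean absolute difference is computed, consider the sketch below. The coefficients are the ones from the table above; the three example tracks and their ratings are invented for illustration only and are not part of our dataset. We also assume a sample standard deviation here.

```python
from statistics import mean, stdev


def predict_excitement(energy, valence):
    """Cross-genre equation from the regression: ex = 0.45 + 0.16*en + 0.07*v."""
    return 0.45 + 0.16 * energy + 0.07 * valence


def absolute_errors(tracks):
    """Absolute difference between predicted and rated excitement per track."""
    return [
        abs(predict_excitement(t["energy"], t["valence"]) - t["rated"])
        for t in tracks
    ]


# Hypothetical tracks (not from our rated set of 152 songs).
example_tracks = [
    {"energy": 0.90, "valence": 0.30, "rated": 0.70},
    {"energy": 0.60, "valence": 0.80, "rated": 0.55},
    {"energy": 0.75, "valence": 0.50, "rated": 0.65},
]

errors = absolute_errors(example_tracks)
print("mean absolute difference:", mean(errors))
print("standard deviation:", stdev(errors))
```

Because both features lie between 0 and 1, this equation can only produce values between 0.45 and 0.68, which is worth keeping in mind when comparing predictions with ratings on the full 0 to 1 scale.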

Discussion of the regression results

When we took each genre separately to come up with a prediction equation for "excitement", the results were not always promising. Only the regression for techno turned out significant. We think that this was mainly due to the small sample sizes (only 40, 49 and 63 songs for hardstyle, disco and techno, respectively) and the inherently subjective nature of the feature "excitement". Only one person rated this feature for every song, making it a very subjective, personal variable. However, we should look beyond the results of this regression analysis only. In the future, when our system is used at large music events, the variable "excitement" can be deduced in a much better way than by quantifying the subjective opinion of one person. One option could be to let every attendee of a music festival rate about 10 relevant songs before attending the event. If 10,000 people do this, the system can create 100,000 data points to base the regression on, instead of 152. We expect that this would give much better results than the provided analysis. Besides, when considering the regression analysis across genres we already came up with a model that has a mean absolute error of only 0.07 on the 0 to 1 excitement scale. We expect this error to become smaller if 100,000 data points are used.

Another option could be to generate the "excitement" information in a more physical sense, for example based on information gathered from heart rate monitors and/or brain activity sensors as used by the Effenaar. This information is more quantitative in nature than people's subjective opinion on the "excitement" level of a track. So, maybe this could function as input that generates a more robust equation for "excitement".

In conclusion, how the information on "excitement" is gathered is a point of discussion as well as an opportunity for improvement. What is most important is that a multiple regression model might be a good way to incorporate feedforward for track selection in our system. In that sense, the value of this particular research is not in the results of these regressions but in the method behind them.

Excitement matching with QET

For the excitement matching algorithm we want a module that takes in a QET from the user and transforms it into a mathematical equation, because a graph is what humans can work with, while an equation is what computers can work with. Because we want to meet the user need of keeping the system simple and easy to use, we decided to let the user input the QET by means of "key points of excitement" in the music set. This is easier than drawing a whole graph. Thus, the user can specify at what points in time they desire certain values of excitement and the system does the rest. We created a MATLAB script to simulate such software and demonstrate that it works.

A script for QET matching algorithm

As an example, we chose as user input a 30-minute music set, divided into 1-minute intervals. In the example case the user inputs the following desires: start the set at an excitement of 0.7, after 5 minutes drop to 0.6 within 5 minutes, then climb to 0.8 within 10 minutes, then drop to 0.7 again within the next 5 minutes and finish off the set by climbing to the maximum excitement of 1.00 in the last 5 minutes. This input from the user is read from a ".txt" file. The contents of this file are transformed into data points, which are depicted below.


Key data points from the user

The next step is to interpolate for values between these key excitement points (in order to get a value for every minute of the set). We chose to use simple linear interpolation, which gave decent results; other sorts of interpolation are also possible and easily implemented using MATLAB. After interpolation, a polynomial fit is applied in order to come up with a mathematical equation for the QET. In this example a 7th order polynomial was created to describe the QET. The result is depicted below.

Interpolation result QET
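For readers without MATLAB, the linear interpolation step can be sketched in plain Python as follows. The key points below are our reading of the example described above (the actual input file is not reproduced here), and the function name is hypothetical. The polynomial fit is left out of this sketch.

```python
def interpolate_qet(key_times, key_values):
    """Linearly interpolate excitement for every whole minute
    between the first and the last key point (times in minutes)."""
    minutes = list(range(int(key_times[0]), int(key_times[-1]) + 1))
    values = []
    seg = 0  # index of the key-point segment that contains the current minute
    for t in minutes:
        # Move to the next segment once t has passed the end of the current one.
        while seg < len(key_times) - 2 and t > key_times[seg + 1]:
            seg += 1
        t0, t1 = key_times[seg], key_times[seg + 1]
        v0, v1 = key_values[seg], key_values[seg + 1]
        values.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return minutes, values


# Assumed key points for the 30-minute example set.
key_times = [0, 5, 10, 20, 25, 30]
key_values = [0.7, 0.7, 0.6, 0.8, 0.7, 1.0]

minutes, values = interpolate_qet(key_times, key_values)
for t, v in zip(minutes, values):
    print(t, round(v, 3))
```

The key times must be strictly increasing, otherwise a segment has zero length and the division fails.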

One can see that the result is quite promising. The polynomial curve follows the desired QET without an undesirably large error. The coefficients of the polynomial curve are also output by this script and can thus be used to construct a mathematical equation for the QET. This equation can then again be sampled (with a sampling period depending on the desired song length) to obtain the distinct excitement values the songs should have over time. In that way, the system can construct and output an ordering of the provided tracks based on this sampled equation for the QET. These tracks are depicted as the orange bars in the graph. In this example case a duration of 2 minutes is chosen for every song, such that a lot of music can be played while there is enough time for every song to play its core part. The output of the system is a ".txt" file that contains the excitement ratings of the tracks in the set generated from the user input. The MATLAB script that does all these calculations is provided below.

%Input the key data points (degree of polynomial should be < data points)
QET_matrix = readmatrix('sample_input.txt'); %First row is time, Second row is excitement
x_keydata = QET_matrix(1,:); %The time points from the user input file
y_keydata = QET_matrix(2,:); %The corresponding excitement ratings
p_keydata = polyfit(x_keydata,y_keydata,3); %Get polynomial coefficients for the key data points
f_keydata = polyval(p_keydata,x_keydata); %Evaluate the fitted polynomial at the key time points
figure;
plot(x_keydata,y_keydata,'o',x_keydata,f_keydata,'-'); %Not a desirable outcome -> use interpolation
title("Input of key data points and polynomial fit");
xlabel("Time [min]");
ylabel("Excitement rating [0,1]");

%Interpolation for the key data points
samples = max(x_keydata); %#samples to interpolate
x_interpolation = 1:samples;
y_interpolation = interp1(x_keydata,y_keydata,x_interpolation); %Linear interpolation between input points (vectorized)

%Get a formula for the graph
n = 7; %Degree of the polynomial
p_polynomial = polyfit(x_interpolation,y_interpolation,n); %Get polynomial coefficients for the interpolated data
f_polynomial = polyval(p_polynomial,x_interpolation); %Make a function from these coefficients

%Create discrete bars (the songs)
songlength = 2; %In minutes
track_times = 1:songlength:length(f_polynomial); %Start minute of every track
track_histogram = f_polynomial(track_times); %One excitement value per track (no zero entries in between)
figure;
hold on;
plot(x_interpolation,y_interpolation,'o',x_interpolation,f_polynomial,'-'); %Plot graphs
bar(track_times,track_histogram,1,'FaceAlpha',0.1); %Plot one transparent bar per track
ylim([0.6 1.05]);
title("Interpolation result and polynomial fit");
xlabel("Time [min]");
ylabel("Excitement rating [0,1]");
legend("Interpolated values", "Polynomial curve", "Tracks on the set");

%Output the formula coefficients
fprintf("Fitted formula coefficients (a1*x^n + a2*x^(n-1) + ...): \n");
for i = 1:n+1
    fprintf("%d \n",p_polynomial(i)); %Prints the coefficients
end

%Output the tracks' excitement rating to a text-file
fileID = fopen('sample_output.txt','w');
fprintf(fileID,"%1.3f\r\n",track_histogram);
fclose(fileID);
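The script above ends with a list of target excitement values, one per track slot. How the provided tracks are then ordered against these targets is not fixed by the script. One simple possibility, shown here purely as an assumption on our side, is a greedy rule: for every slot, pick the not yet used track whose excitement is closest to the target. The track titles and values are hypothetical.

```python
def order_tracks(targets, tracks):
    """Greedy matching: for each target excitement value, take the unused
    track (title, excitement) whose excitement is closest to that target."""
    remaining = list(tracks)
    ordered = []
    for target in targets:
        if not remaining:
            break  # fewer tracks than slots: stop when we run out
        best = min(remaining, key=lambda track: abs(track[1] - target))
        remaining.remove(best)
        ordered.append(best)
    return ordered


# Hypothetical sampled QET values and candidate tracks.
targets = [0.70, 0.62, 0.80, 1.00]
tracks = [("A", 0.95), ("B", 0.60), ("C", 0.72), ("D", 0.81), ("E", 0.65)]

playlist = order_tracks(targets, tracks)
print([title for title, _ in playlist])
```

A greedy rule is easy to understand but not necessarily optimal over the whole set; an assignment method that minimizes the total deviation would be an alternative.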

A GUI for QET matching

Given the time budget it was infeasible to integrate the QET matching algorithm in the Java-based website. Therefore, we decided to let it function as a standalone application that comes as an extra to our system. Because the algorithm is of value to the project as a whole, we worked further on it and created an executable GUI for it. The working principles are exactly the same as described in the prior sections, but now it is easier for the user to enter the desired QET via a visual tool. A screenshot of the tool is provided below.

A graphical interface for QET matching

By following the numbered steps, the user can fully control the desired QET of his or her music set. In the first step the user defines the duration of the music set in minutes. In the next step the user defines the number of time points he or she wants to control - the depth of control so to speak. In the third step the user makes use of a slider to set the desired excitement values of all the time points. If the user makes any mistakes in this, it is possible to go back to previous points. In the last step the user defines the length for which every song should be played and then presses "Generate matching values". When this button is pushed the user gets to see the graph with the interpolation result, polynomial fit and bar graph of the tracks - as described and displayed in the prior section. Also, this bar graph of the tracks and their excitement values is transformed into data (an array) and written to a ".txt" file so the rest of the system can work with it.

By providing the user with a visual tool for deciding on a QET, the system meets the primary user need of an easy to understand user interface that requires no experts to operate. Besides, it meets the secondary user needs of a structured and progressive music set and of taking the dancers on a cohesive and dynamic music journey.

System overview

Below a graphic overview of the system is given. It is important to note that the grey area is the scope of our project; designing the rest is not feasible within our time budget. Besides, a lot of research has already been done on the modules outside the grey region. Mixing tracks together seamlessly is a desirable skill for all DJs and therefore a skill for which a lot of DJs seek help in the form of technology. Due to this high demand, a lot of properly working mixing algorithms already exist, for example (Cliff, 2006). Certain feedback sensors that measure the excitement of the audience also already exist. A lot of wearable devices that measure all sorts of things like heart rate or motion already exist and have been used. One can look at one of the prior sections to gain information on this. Another real life example is the Effenaar, which has used technology to measure brain activity and excitement among attendees of a music event. This lets us make the assumption that the mixing and the feedback part will work.

Globally, the system works as follows. The system has a large database to pick music from. This database contains for every song an "excitement" value that is based on a multiple regression output of the Spotify audio features "valence" and "energy". The database is freely available to the user. The music of this database is passed through the pre-filter. The pre-filter filters the songs in the database on genre and Spotify features (which features is still to be decided) based on the user input that is defined by the user interface. In that sense, the pre-filter outputs an array of tracks that is already filtered according to the user's desire. Next, the filtered tracks will be matched according to the desired QET (we will use an excitement graph instead of a tempo graph but the principle is the same). The result of this matching is a sequence of tracks that creates the playlist to be played for that evening (or morning?). This playlist is also freely available to the user. The playlist is then fed to the mixing algorithm to make sure that the system outputs a nicely mixed set of music to enjoy. This music is rated on excitement by means of feedback sensors. This feedback is used to update the playlist via the excitement matching module. Thus, if the audience is not happy with the currently playing music the system can act upon that.

System overview
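As an illustration of the pre-filter stage only, the sketch below filters a small database on genre and on optional feature ranges. Since the exact Spotify features for filtering are still to be decided, the `feature_ranges` argument, the field names and the example database are all assumptions made for this sketch; the matching, mixing and feedback stages are not covered.

```python
def prefilter(database, genres, feature_ranges=None):
    """Keep tracks whose genre is allowed and whose features fall inside
    the given (low, high) ranges. feature_ranges maps feature name -> (low, high)."""
    feature_ranges = feature_ranges or {}
    selected = []
    for track in database:
        if track["genre"] not in genres:
            continue
        in_range = all(
            low <= track[name] <= high
            for name, (low, high) in feature_ranges.items()
        )
        if in_range:
            selected.append(track)
    return selected


# Hypothetical database entries; "excitement" is assumed to be precomputed.
database = [
    {"title": "T1", "genre": "techno", "energy": 0.90, "excitement": 0.62},
    {"title": "T2", "genre": "techno", "energy": 0.40, "excitement": 0.55},
    {"title": "D1", "genre": "disco", "energy": 0.70, "excitement": 0.58},
    {"title": "H1", "genre": "hardstyle", "energy": 0.95, "excitement": 0.66},
]

filtered = prefilter(database, {"techno", "disco"}, {"energy": (0.5, 1.0)})
print([track["title"] for track in filtered])
```

The output of such a pre-filter is the array of tracks that is handed to the excitement matching module.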

Validation of the user needs

Primary user needs

User need Validation method(s)
The DJ-robot is more valued than a human performer.
  • Blind testing alongside a human DJ, as a kind of Turing test
    • Café/disco: have a human DJ play one evening and the robot another evening, while making it appear as if a human is playing the records.
    • Have the guests rate the music experience. Count the average number of guests per hour.
    • Repeat this over multiple evenings to obtain a statistically significant result.
  • Silent disco with multiple channels, one with the robot DJ and one with a real DJ (again blind)
    • Count at regular intervals which channels are being listened to.
    • Repeat this over multiple evenings to obtain a statistically significant result.
  • Have a group of DJs create a music set. Have the robot create a comparable set, i.e. same genre, same QET, etc.
    • Have a group of random people rank the sets from good to bad.
The system's user interface is easy to understand, no experts needed.
  • Usability testing

Secondary user needs

User need Validation method(s)
  • The music set played is structured and progressive.
  • Track selection should fit the audience background.
    • The system selects appropriate tracks regarding genre.
    • The musical presentation should reflect the audience energy level.
  • There is a balance between playing familiar tracks and providing rare, new music.
    • The system selects popular tracks that are valued by the audience.
    • Similar tracks to what the audience is into are played.
  • The dancers are taken on a cohesive and dynamical music journey.
  • The audience desires control over the music being played, to a certain extent.
    • The audience wants to hear their favourite music.
    • The audience doesn't want a predictable set of music.
    • The DJ-robot takes the audience reaction into account in track selection.
  • Not in the scope of our project.

Who's Doing What?

Personal Goals

The following section describes the main roles of the teammates within the design process. Each team member has chosen an objective that fits their personal development goals.

Name Personal Goal
Mats Erdkamp Play a role in the development of the artificial intelligence systems.
Sjoerd Leemrijse Gain knowledge in recommender systems and pattern recognition algorithms in music.
Daan Versteeg
Yvonne Vullers Play a role in creating the prototype/artificial intelligence
Teun Wittenbols Combine all separate parts into one good concept, with a focus on user interaction.

Weekly Log

Based on the approach and the milestones, a planning has been made. This planning is not definite and will be updated regularly; however, it will serve as a guideline for the coming weeks.


Week 2 Goal: Do literature research, define problem, make a plan, define users, start research into design and prototype

Group Mats Erdkamp Sjoerd Leemrijse Daan Versteeg Yvonne Vullers Teun Wittenbols
Monday 10-02 We formed a group and discussed the first possibilities within the project, chose a general theme and started doing research. We attended the tutor session.


30 minutes + 30 minutes

Work on SotA and evaluate design options Attended meeting with tutors: 30 minutes
Elaborated the notes of the meeting: 30 minutes
Reading and summarizing scientific literature on the interaction between DJ and crowd: 3 hours
Attended meeting with tutors: 30 minutes Attended tutor meeting


0.5 hours

Started doing literature research and summarized Pasick (2015) & Johnson. Formed a group and attended meeting.


2 hours + 30 minutes + 30 minutes

Tuesday 11-02 Searching scientific literature on user requirements of the public and the DJ at a party (Gates, Subramanian & Gutwin, 2006), (Gates & Subramanian, 2006): 1 hour
Wednesday 12-02 Summarizing scientific literature on user requirements: 3 hours Started looking for papers about user interaction and user feedback


2 hours

Thursday 13-02 We had a meeting in which we discussed the feedback from the tutor session, discussed the research and formed a more detailed and specific plan for the project.


1.5 hours

Meeting with group members, discussing who will be doing what the coming week: 1.5 hours Meeting with group members, discussing who will be doing what the coming week. Emailed Effenaar.


1.5 hours

Did some more research on audience interaction and summarized it. (Hödl, Fitzpatrick, Kayali & Holland, 2017)(Zhang, Wu, & Barthet, in press) I also updated the wiki, made the planning more clear and divided the references of the SotA. Attended the meeting.


1.5 hours + 45 minutes + 1.5 hours

Friday 14-02 Developed access to Spotify API: 4 hours. Researched different forms of audience interaction and added it to the SotA. #Receiving feedback from the audience via technology


1.5 hours

Saturday 15-02 Created data set from Spotify API: 8 hours Writing the "Users" section based on my prior literature research: 2 hours
Updating the section on state of the art based on my prior literature research: 2 hours
Sunday 16-02 Updating the lay-out of the wiki: 1 hour Updating the milestones and deliverables. Continued looking for papers on incorporating user feedback and user feedback for music events. Summarized papers (Barkhuus & Jorgensen, 2008) & (Atherton, Becker, McLean, Merkin & Rhoades, 2008)


2.5 hours


Week 3 Goal: Continue research, start on design

Group Mats Erdkamp Sjoerd Leemrijse Daan Versteeg Yvonne Vullers Teun Wittenbols
Monday 17-02 Group meeting and general discussion: 45 minutes Attended tutor meeting: 30 minutes. Discussion with group members: 15 minutes. Attended tutor meeting: 30 minutes. Discussion with group members: 15 minutes. Attended tutor meeting: 30 minutes. Discussion with group members: 15 minutes. Attended tutor meeting:


45 min

Tuesday 18-02
Wednesday 19-02 Worked on a first concept model of our designed system, A first model. 4 hours Elaborated the notes of the meeting: 30 minutes
Thursday 20-02 Group meeting, discussing plans and everyone's contributions: 1 hour Group meeting, discussed plans 1 hour Attended group meeting: 1 hour. Worked on how the user needs lead to a first model: 2 hours Attended group meeting: 1 hour Attended group meeting: 1 hour Attended group meeting:


1 hour

Friday 21-02
Saturday 22-02
Sunday 23-02


Carnaval break

Group Mats Erdkamp Sjoerd Leemrijse Daan Versteeg Yvonne Vullers Teun Wittenbols
Monday 24-02 Worked on a schematic to show how the user needs relate to our design: 1.5 hours
Tuesday 25-02
Wednesday 26-02
Thursday 27-02 Worked on a model to predict music parameters using the Spotify dataset in a multiple regression model: 2 hours Get to know the basics of node.js


1.5 hours

Friday 28-02 Worked on getting to know JavaScript, node.js, and Visual Studio: 6 hours.
Saturday 29-02
Sunday 1-03 Started integration of data set in Tempo curve algorithm: 4 hours. Did research on the Effenaar: 30 minutes
Elaborating on the algorithm in the first model: 2 hours
Continued on getting to know JavaScript and node.js. Also started on getting to know express JS: 4 hours.


Week 4 Goal: Finish first design, start working on software.

Group Mats Erdkamp Sjoerd Leemrijse Daan Versteeg Yvonne Vullers Teun Wittenbols
Monday 02-03 We attended the tutor session and went to the Effenaar in order to get some general information about the possibilities and/or user needs.


30 minutes + 1 hour

Attended the tutor meeting: 30 minutes
Had a conversation with the Effenaar: 1 hour
Attended the tutor meeting: 30 minutes
Had a conversation with the Effenaar: 1 hour
Worked out the notes on the conversation with Effenaar: 30 minutes
Attended the tutor meeting: 30 minutes
Had a conversation with the Effenaar: 1 hour
Attended the tutor meeting: 30 minutes
Tuesday 03-03 Worked on explaining the feedforward and feedback parameters more clearly in the concept model: 2 hours
Wednesday 04-03 Did research on the Spotify audio features, tried to come up with exact definitions: 2 hours
Thursday 05-03 We had a meeting in which we discussed the feedback from the tutor session, discussed the research and formed a more detailed and specific plan for the project.


1.5 hours

Attended the group meeting: 1.5 hours Attended the group meeting: 1.5 hours Attended the group meeting: 1.5 hours Attended the group meeting: 1.5 hours Attended the meeting


1.5 hours

Friday 06-03 Cleaned up data set generation code 2 hours
Saturday 07-03 Added new tags to data set & included pre-filtering backend. 7 hours Worked on a multiple regression model for feedforward: 4 hours
Sunday 08-03 Finalized pre-filtering backend + made API calls more reliable 4 hours Processed the results of the multiple regression analysis and added them to the wiki: 3 hours Worked more on node.js, expressJS and the UI: 6 hours


Week 5 Goal: Work on software

Group Mats Erdkamp Sjoerd Leemrijse Daan Versteeg Yvonne Vullers Teun Wittenbols
Monday 09-03 We attended the tutor session
30 minutes
Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes
Tuesday 10-03
Wednesday 11-03 Try to specify/quantify "excitement" and search for data on it: 2 hours Elaborated meeting notes: 30 minutes
Thursday 12-03 We had a meeting in which we discussed the feedback from the tutor session, formed a more detailed and specific plan for the project.


1 hour

Attended the group meeting: 1 hour Attended the group meeting: 1 hour Attended the group meeting: 1 hour Attended the group meeting: 1 hour
Friday 13-03 Ideated extra backend implementations: 1 hour Creating a graphical system overview: 2 hours Worked on a more general grouping of the different genres and worked on the visualization of a part of the back end 45 minutes
Saturday 14-03 Describing the system overview on the Wiki: 2 hours Started working on interview with the Effenaar: 1 hour Worked on getting to know Bootstrap JS for creating the UI: 2 hours
Sunday 15-03 Worked on a QET matching algorithm: 2.5 hours Worked on learning parts that are useful for creating the UI & a little bit on the UI itself: 5 hours


Week 6 Goal: Finish software, do testing

Group Mats Erdkamp Sjoerd Leemrijse Daan Versteeg Yvonne Vullers Teun Wittenbols
Monday 16-03 We attended the tutor session
30 minutes
Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes
Tuesday 17-03 Worked further on QET track ordering: 2 hours
Looked into Java MATLAB connection: 1 hour
Worked on UI 2 hours
Wednesday 18-03 Looked into Webflow. 2 hours Worked out interview with the Effenaar: 2 hours
Thursday 19-03 Worked on an interface for the QET, now works with .txt files: 2 hours
Learned GitHub: 1 hour
Worked on UI 5 hours
Friday 20-03 We had a meeting in which we discussed the feedback from the tutor session, and worked on setting up GitHub.


1.5 hours

Attended group meeting 1.5 hours Attended the group meeting: 1.5 hours Attended the group meeting: 1.5 hours, Looked at/trying to understand Mats's improvements on UI 30 minutes Attended the group meeting: 1.5 hours
Saturday 21-03 Improved frontend code, took a Vue.js class 5 hours
Sunday 22-03 Searched for frontend CSS design & Improved code 4 hours Added new inputs and a value display to the UI 2 hours Worked on the user interface using the code Yvonne made. I changed the HTML slightly and added CSS in Adobe Dreamweaver. 2 hours



Week 7 Goal: Finish up the last bits of the software

Group Mats Erdkamp Sjoerd Leemrijse Daan Versteeg Yvonne Vullers Teun Wittenbols
Monday 23-03 We attended the tutor session
30 minutes
Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes Attended the tutor meeting: 30 minutes
Tuesday 24-03 Backend development + CSS slider development 7 hours Did research on MATLAB Compiler for Java application: 3 hours
Wednesday 25-03 Learned CSS + started integration of backend 6 hours
Thursday 26-03 We had a meeting in which we discussed the feedback from the tutor session.


1 hour

Attended the group meeting: 1 hour
Updated the API request code: 3 hours
Attended the group meeting: 1 hour
Tried to transform MATLAB functions to Java: 3 hours
Attended the group meeting: 1 hour
Rearranging the wiki and improving the display of code on the wiki: 45 minutes
Attended the group meeting: 1 hour Attended the group meeting: 1 hour
Friday 27-03 Worked on a GUI for QET matching: 2 hours Scanning through the wiki on unsubstantiated choices and other nitpicking: 2 hours Worked on the UI 6 hours
Saturday 28-03 Worked on GUI for QET matching: 3 hours
Sunday 29-03 Reported my progress on the wiki and edited existing pages: 2 hours Scanning through the wiki on unsubstantiated choices and other nitpicking: 1.5 hours
Starting on validation of user needs: 1.5 hours
Worked on the UI 5 hours


Week 8 Goal: Finish wiki, presentation

References


Atherton, W. E., Becker, D. O., McLean, J. G., Merkin, A. E., & Rhoades, D. B. (2008). U.S. Patent Application No. 11/466,176.


Barkhuus, L., & Jørgensen, T. (2008). Engaging the crowd: studies of audience-performer interaction. In CHI'08 extended abstracts on Human factors in computing systems (pp. 2925-2930).


Berkers, P., & Michael, J. (2017). Just what makes today’s music festivals so appealing?.


Choi, K., Cho, K. “Deep Unsupervised Drum Transcription”, 20th International Society for Music Information Retrieval Conference, Delft, The Netherlands, 2019.


Choi, K., Fazekas, G., Cho, K., & Sandler, M. (2017). A tutorial on deep learning for music information retrieval. arXiv preprint arXiv:1709.04396.


Cliff, D. (2006). hpDJ: An automated DJ with floorshow feedback. In Consuming Music Together (pp. 241-264). Springer, Dordrecht.


De León, P. J. P., & Inesta, J. M. (2007). Pattern recognition approach for music style identification using shallow statistical descriptors. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 37(2), 248-257.


Feldmeier, M. C. (2003). Large group musical interaction using disposable wireless motion sensors (Doctoral dissertation, Massachusetts Institute of Technology).


Feldmeier, M., & Paradiso, J. A. (2004, April). Giveaway wireless sensors for large-group interaction. In CHI'04 Extended Abstracts on Human Factors in Computing Systems (pp. 1291-1292).


Freeman, J. (2005) Large Audience Participation, Technology, and Orchestral Performance in Proceedings of the International Computer Music Conference, 2005, pp. 757–760.


Gates, C., & Subramanian, S. (2006). A Lens on Technology’s Potential Roles for Facilitating Interactivity and Awareness in Nightclub. University of Saskatchewan: Saskatoon, Canada.


Gates, C., Subramanian, S., & Gutwin, C. (2006, June). DJs' perspectives on interaction and awareness in nightclubs. In Proceedings of the 6th conference on Designing Interactive systems (pp. 70-79).


Greasley, A. E. (2017). Commentary on: Solberg and Jensenius (2016) Investigation of intersubjectively embodied experience in a controlled electronic dance music setting. Empirical Musicology Review, 11(3-4), 319-323.


Humphrey, E.J., Durand, S., McFee, B. “OpenMIC-2018: An open dataset for multiple instrument recognition”, 19th International Society for Music Information Retrieval Conference, Paris, France, 2018.


Hamel, P., & Eck, D. (2010, August). Learning features from music audio with deep belief networks. In ISMIR (Vol. 10, pp. 339-344).


Hödl, O., Fitzpatrick, G., Kayali, F., & Holland, S. (2017). Design Implications for Technology-Mediated Audience Participation in Live Music. In Proceedings of the 14th Sound and Music Computing Conference, July 5-8 2017, Aalto University, Espoo, Finland (pp. 28-34).


Hoffman, G., & Weinberg, G. (2010). Interactive Jamming with Shimon: A Social Robotic Musician. Proceedings of the 28th of the International Conference Extended Abstracts on Human Factors in Computing Systems, 3097–3102.


Huron, D. (2002). Music information processing using the Humdrum toolkit: Concepts, examples, and lessons. Computer Music Journal, 26(2), 11-26.


Jannach, D., Kamehkhosh, I., & Lerche, L. (2017, April). Leveraging multi-dimensional user models for personalized next-track music recommendation. In Proceedings of the Symposium on Applied Computing (pp. 1635-1642).


Johnson, D. (n.d.). Robot DJ Used By Nightclub Replaces Resident DJs. Retrieved on 09-02-2020 from http://www.edmnightlife.com/robot-dj-used-by-nightclub-replaces-resident-djs/


Kashino, K., Nakadai, K., Kinoshita, T., & Tanaka, H. (1995). Application of Bayesian probability network to music scene analysis. Computational auditory scene analysis, 1(998), 1-15.


McAllister, G., Alcorn, M., Strain, P. (2004) Interactive Performance with Wireless PDAs Proceedings of the International Computer Music Conference, 2004, pp. 1–4.


Pasick, A. (21 December 2015) The magic that makes Spotify's Discover Weekly playlists so damn good. Retrieved on 09-02-2020 from https://qz.com/571007/the-magic-that-makes-spotifys-discover-weekly-playlists-so-damn-good/


Pérez-Marcos, J., & Batista, V. L. (2017, June). Recommender system based on collaborative filtering for Spotify's users. In International Conference on Practical Applications of Agents and Multi-Agent Systems (pp. 214-220). Springer, Cham.


Shmulevich, I., & Povel, D. J. (1998, December). Rhythm complexity measures for music pattern recognition. In 1998 IEEE Second Workshop on Multimedia Signal Processing (Cat. No. 98EX175) (pp. 167-172). IEEE.


Wen, R., Chen, K., Xu, K., Zhang, Y., & Wu, J. (2019, July). Music Main Melody Extraction by An Interval Pattern Recognition Algorithm. In 2019 Chinese Control Conference (CCC) (pp. 7728-7733). IEEE.


Yoshii, K., Nakadai, K., Torii, T., Hasegawa, Y., Tsujino, H., Komatani, K., Ogata, T. & Okuno, H. G. (2007, October). A biped robot that keeps steps in time with musical beats while listening to music with its own ears. In 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 1743-1750). IEEE.


Zhang, L., Wu, Y., & Barthet, M. (in press). A Web Application for Audience Participation in Live Music Performance: The Open Symphony Use Case. NIME. Retrieved from https://core.ac.uk/reader/77040676