Week 1 (Control Systems Technology Group wiki), last edited 2017-02-03 by S126005
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
=Searching for sources=<br />
We are interested in focusing on the speech of a robot. So far robots still talk very monotonously, and we think it would be an improvement to give robots emotion in their voice. To choose a more specific direction we will explore the available information. <br />
<br />
The four questions below indicate a number of possible directions. The goal is to see which direction is the best one for our research. We look at how much information already exists and to what extent we can contribute something ourselves.<br />
<br />
<br />
===To what extent has the technology for making robots speak been developed at this point?===<br />
<br />
<br />
Important concept: dysarthria = a motor speech disorder (difficulty speaking), e.g. as a result of ALS<br />
<br />
=====“High energy efficiency biped robot controlled by the human brain for people with ALS disease.”=====<br />
<br />
As a solution for ALS, this article goes deeper into BCI (connecting a computer to the brain by measuring brain activity with EEG) and humanoid robots as assistants. This allows robotic devices to be controlled by the brain. About speech it only says that in the future there will be a version of BrainControl (the first BCI used by people who cannot move any muscles but are still conscious) that includes a text-to-speech function.<br />
<br />
=====“A Smart Interaction Device for Multi-Modal Human-Robot Dialogue”=====<br />
<br />
The Smart Interaction Device (SID) is a robot that can hold a dialogue with a user. Soar is used in the SID system to reason over predefined rules.<br />
<br />
=====“Programmable Interactive Talking Device”=====<br />
<br />
Technical report about a device that can convert text (or other digital input) into sound (speech).<br />
<br />
=====https://www.apple.com/accessibility/ios/voiceover/=====<br />
<br />
Apple has a feature that also lets blind people use its products: Apple products can 'read aloud' everything on screen.<br />
<br />
=====“Nao Key Feature Audio Signal Processing”=====<br />
<br />
This article describes how the audio modules are organized in the Nao robot, for example how to send data to Nao's speakers.<br />
<br />
===Which techniques are currently in development for 'writing' with the eyes and the brain?===<br />
<br />
<br />
====="EyeBoard: A Fast and Accurate Eye Gaze-Based Text Entry System"=====<br />
Proposes a new interface for dwell-free eye-writing. <br />
<br />
====="The potential of dwell-free eye-typing for fast assistive gaze communication"=====<br />
Gaze communication systems have been researched for over 30 years. [Majaranta and Räihä 2002]<br />
<br />
Earlier technique: eye-typing = if you stare at a letter for longer than a preset dwell time, the system assumes you want to type that letter. Reported rates were between 7 and 20 wpm. [Majaranta and Räihä 2002; Majaranta et al. 2009; Wobbrock et al. 2008; Tuisku et al. 2008; Ward and MacKay 2002]<br />
Other fast technique: Dasher = works with boxes that each represent a letter. The larger the box, the more probable it is that the letter will be chosen. <br />
Newly proposed technique: dwell-free eye-typing = swiping with your eyes while the system tries to figure out what you meant.<br />
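The dwell-time mechanism described above can be sketched in a small simulation. This is an illustrative sketch only: the gaze samples, the 50 ms sampling period, and the 500 ms threshold are hypothetical values, not taken from the cited papers.<br />

```python
def eye_type(gaze_samples, sample_period_ms=50, dwell_ms=500):
    """Classic dwell-time eye-typing: a letter is 'typed' once the gaze
    has rested on its key for at least dwell_ms milliseconds."""
    typed = []
    current, held_ms = None, 0
    for key in gaze_samples:              # one gaze-fixation sample per tick
        if key == current:
            held_ms += sample_period_ms
        else:                             # gaze jumped to another key: restart timer
            current, held_ms = key, sample_period_ms
        if held_ms == dwell_ms:           # fires exactly once per dwell
            typed.append(key)
    return "".join(typed)

# 12 samples on 'h' (600 ms) and 11 on 'i' (550 ms) both cross the threshold
print(eye_type(["h"] * 12 + ["i"] * 11))  # prints "hi"
```

Lowering dwell_ms speeds up typing but increases accidental selections, which is exactly the trade-off that dwell-free approaches try to escape.<br />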
<br />
====="Writing with Your Eye: A Dwell Time Free Writing System Adapted to the Nature of Human Eye Gaze"=====<br />
<br />
Problems with eye-tracking software:<br />
* accuracy is limited to 0.5-1.0 degrees of visual angle<br />
* latency depends on the sampling frequency<br />
* jitters and tremors make it difficult to point with the eye<br />
* the 'Midas touch' problem: the eye moves to whatever attracts attention, so looking cannot be distinguished from deliberate selection<br />
<br />
====="Control of a two-dimensional movement signal by a noninvasive brain-computer interface in humans."=====<br />
<br />
Article about non-invasive cursor movement control. <br />
<br />
====="Neural Signal Based Control of the Dasher Writing System"=====<br />
<br />
Writing with Dasher, sending the signals via EEG.<br />
Big advantage: no muscle control is needed, which avoids pain and a lack of precision. <br />
<br />
====="Language Model Applications to Spelling with Brain-Computer Interfaces"=====<br />
<br />
Ways of spelling by using different BCI techniques.<br />
<br />
<br />
<br />
===To what extent is it possible to attach emotions to the speech of robots?===<br />
<br />
<br />
=====“Emotions in the voice: humanizing a robotic voice”=====<br />
<br />
The most important characteristics of the emotions sadness, anger and happiness are evaluated. These characteristics are then used in the speech of a robot, and a group of people has to detect which emotion the robot expresses.<br />
<br />
<br />
<br />
===To what extent is it possible to turn a human voice into a synthesized voice that sounds just like the recorded human voice?===<br />
<br />
Recording a human voice and turning it into a synthesized voice that sounds just like the recorded voice is also known as voice cloning. <br />
<br />
Several companies and institutes do research into, or make use of, voice cloning:<br />
<br />
- Cereproc<br />
<br />
This company uses voice cloning. To create your own voice they need at least 40 minutes of audio recordings. The recordings have to meet all kinds of requirements, for example that no other sounds may be present and that the recordings must be of high quality. In addition, the voice must sound as consistent as possible across recordings: as little variation as possible in speed, pitch, volume, etc. For the voice cloning the company uses HTS voices. <br />
<br />
=====The HMM-based speech synthesis system (HTS) version 2.0=====<br />
<br />
- Euan MacDonald Centre<br />
<br />
Collaborates with the University of Edinburgh. Together they are researching voice recordings and artificial voices with a 'personal touch' for MND (ALS) patients. Using a voice recording of a patient and 'donor voices', an artificial voice can be created. This requires 400 sentences from the patient; the selected sentences contain all the sounds of the English language in all possible combinations.<br />
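Selecting a small set of sentences that together cover all sounds of the language is essentially a set-cover problem, for which a greedy heuristic is the usual approach. The sketch below is illustrative only: the sentences and 'phoneme' sets are made up, not real English phonetics or the centre's actual procedure.<br />

```python
def pick_sentences(candidates, needed):
    """Greedy set cover: repeatedly pick the sentence that adds the most
    still-missing units until everything in `needed` is covered."""
    covered, chosen = set(), []
    while not needed <= covered:
        best = max(candidates, key=lambda s: len(candidates[s] & (needed - covered)))
        if not candidates[best] & (needed - covered):
            break                  # some needed units occur in no sentence
        chosen.append(best)
        covered |= candidates[best]
    return chosen

# Hypothetical toy inventory: sentence -> set of phoneme-like units it contains
candidates = {
    "sentence A": {"p", "a", "t"},
    "sentence B": {"k", "a"},
    "sentence C": {"s", "t", "k"},
}
print(pick_sentences(candidates, {"p", "a", "t", "k", "s"}))
# -> ['sentence A', 'sentence C']
```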
<br />
Voice cloning involves several ethical questions. If a voice can be imitated, for example that of a (deceased) celebrity, who has the rights to it? Anyone could run off with it.<br />
<br />
<br />
<br />
=Meeting Thursday 04-09-2014=<br />
=====Possible research questions at this point=====<br />
<br />
# How does emotion in a robot voice affect the robot's user?<br />
# How can emotions be introduced into a robot voice?<br />
# Which aspects characterize particular emotions, and how can this be used in a robot voice?<br />
<br />
=====Plan divided per person=====<br />
<br />
# Collecting audio fragments<br />
# Adjusting signals with a particular program (pitch, frequency, amplitude, duration)<br />
# Doing a literature study on characteristics of emotion(s)<br />
# Forming a target group that will evaluate the voices<br />
<br />
<br />
<br />
=Presentation Monday 08-09-2014=<br />
<br />
Points to include:<br />
<br />
* Topic<br />
* Objective<br />
* Approach to reach the objective<br />
* (How far along is the technology now?)<br />
<br />
For the slides and the accompanying explanation, see the link below:<br />
<br />
[[File:Presentatie 08-09-2014.pdf]]<br />
<br />
======Slide 1: Emotions in robot speech======<br />
{| border="1" style="text-align:left; width:40%;"<br />
|<br />
''Iris Huijben''<br />
<br />
''Meike Berkhoff''<br />
<br />
''Floor Fasen''<br />
<br />
''Suzanne Vugs''<br />
|}<br />
<br />
Explanation:<br />
<br />
We will focus our research on emotions in robot speech. This is a research area in full swing. We have already started a literature study on emotions in human and artificial voices.<br />
<br />
======Slide 2: Research question======<br />
{| border="1" style="text-align:left; width:40%;"<br />
|<br />
''To what extent is it possible to give emotions to a robot voice?''<br />
|}<br />
<br />
======Slide 3: Objectives======<br />
{| border="1" style="text-align:left; width:40%;"<br />
|<br />
''Making the robot's communication more human in order to improve interaction.''<br />
<br />
''Extending the existing possibilities for showing emotions.''<br />
|}<br />
<br />
Explanation:<br />
<br />
Why: Giving a robot emotions through speech can improve human-robot interaction, because the robot will come across as more human.<br />
<br />
What: At the moment robots mainly express emotions through physical gestures and facial expressions. Much research is currently being done on emotions in robot speech. In this project we want to investigate which aspects of speech must be adjusted in order to use speech to give emotions to robots.<br />
<br />
======Slide 4: USE perspective - User======<br />
{| border="1" style="text-align:left; width:40%;"<br />
|<br />
''- Companion robot''<br />
<br />
''- Communication aid''<br />
<br />
''- Care robot''<br />
<br />
''- Persuasive technology''<br />
|}<br />
<br />
Explanation:<br />
<br />
Companion robots are used to counter loneliness. Giving such a robot emotions will most likely create a better bond with the user. A communication aid can use emotions for, for example, people who have a speech impairment due to illness and can no longer use their own voice. A care robot can use emotions to gain patients' trust, so that the robot is accepted into a household more readily. It can also add value to persuasive technology: researchers in that field can examine whether emotions in a robot voice contribute to the robot's persuasiveness.<br />
<br />
======Slide 5: Research approach======<br />
{| border="1" style="text-align:left; width:40%;"<br />
|<br />
''- Literature study on the characteristic aspects of emotions in voices.''<br />
<br />
''- Deciding which emotions we will investigate.''<br />
<br />
''- Implementing the found aspects in the Nao robot.''<br />
<br />
''- Possibly receiving feedback from participants.''<br />
|}<br />
<br />
Explanation:<br />
<br />
First, a literature study has to be done to discover which aspects contribute to emotions in a voice. Possible aspects are, for example, frequency and pitch. We must investigate which aspects we can use in our research and check whether we can adjust them in the Nao robot. If there is room for it, we would also like to receive feedback from participants to see whether we have reached our goal. Participants would then indicate which emotion they think they hear in a voice (choosing from the 6 basic emotions of Ekman & Friesen).<br />
<br />
======Slide 6: Knowledge about emotions======<br />
{| border="1" style="text-align:left; width:40%;"<br />
|<br />
''- Speech rate''<br />
<br />
''- Mean pitch''<br />
<br />
''- Pitch range''<br />
<br />
''- Intensity''<br />
<br />
''- Voice quality''<br />
<br />
''- Pitch changes''<br />
<br />
''- Articulation''<br />
|}<br />
<br />
Explanation:<br />
<br />
The terms above are voice aspects that have already been studied. Emotions have a particular influence on your voice, and these terms characterize that influence.<br />
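Some of the features above (mean pitch, intensity) can be estimated directly from an audio signal. The sketch below is a deliberately crude illustration on a synthetic tone: the zero-crossing pitch estimate only works for clean, voiced signals, and none of the numbers come from the literature.<br />

```python
import numpy as np

def voice_features(signal, sample_rate):
    """Crude estimates of two features used to characterize emotion in speech."""
    # Intensity: root-mean-square energy of the signal
    intensity = float(np.sqrt(np.mean(signal ** 2)))
    # Pitch: half the zero-crossing rate equals the fundamental of a pure tone
    crossings = np.sum(np.abs(np.diff(np.signbit(signal).astype(int))))
    duration = len(signal) / sample_rate
    pitch_hz = crossings / (2 * duration)
    return pitch_hz, intensity

sr = 16000
t = np.arange(sr) / sr                    # one second of audio
tone = 0.5 * np.sin(2 * np.pi * 220 * t)  # 220 Hz test tone
pitch, rms = voice_features(tone, sr)
print(round(pitch), round(rms, 3))        # roughly 220 Hz and 0.354
```

For real speech one would use an autocorrelation- or cepstrum-based pitch tracker, but the principle of reducing a voice to a handful of numeric features is the same.<br />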
<br />
======Slide 7: Speech possibilities of the Nao======<br />
{| border="1" style="text-align:left; width:40%;"<br />
|<br />
''- Amplitude, pitch etc.''<br />
<br />
''- Alternative:''<br />
DECtalk -> text-to-speech synthesizer<br />
|}<br />
<br />
Explanation:<br />
<br />
Nao's speech can be adjusted in several respects, for example pitch and amplitude. It is not yet clear to us exactly what can be adjusted; we will do more literature research on this. If our literature study shows that we want to adjust more aspects than the Nao allows, we will look for an alternative. This alternative will be a speech program with more options to influence the sound. One alternative we have already found in the sources is DECtalk, a text-to-speech synthesizer. <br />
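The combined effect of pitch and speed changes can be illustrated offline: naively resampling a waveform raises the pitch and shortens the duration by the same factor, which is why serious systems (such as the DECtalk synthesizer mentioned above) use a vocoder to control the two independently. The sketch below is a generic illustration, not Nao API code.<br />

```python
import numpy as np

def resample_pitch_shift(signal, factor):
    """Play the signal back `factor` times faster: pitch rises by `factor`
    and duration shrinks by the same factor (naive, vocoder-free approach)."""
    n_out = int(len(signal) / factor)
    # Positions in the original signal to read from, linearly interpolated
    positions = np.arange(n_out) * factor
    return np.interp(positions, np.arange(len(signal)), signal)

sr = 8000
t = np.arange(sr) / sr
voice = np.sin(2 * np.pi * 200 * t)        # one second of a 200 Hz "voice"
higher = resample_pitch_shift(voice, 1.5)  # sounds like 300 Hz, 2/3 the length
print(len(voice), len(higher))             # 8000 5333
```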
<br />
<br />
<br />
= Feedback on the presentation of Monday 08-09-2014 =<br />
<br />
* Research the companies/institutions/research groups leading in this research segment<br />
* How will you search for literature? Where will you find literature? (For example, give the keywords you will use.)<br />
* Research the right program for the implementation: is the NAO strictly necessary? Amigo might also be used (Amigo's speech synthesis was co-developed by Philips Research, and TU/e has a license to borrow it). Furthermore, a speech program alone is also suitable for our research; it does not have to be a robot. So look for open-source software with speech synthesis.<br />
* Make the user more specific: choose one target group and one kind of robot to focus on.<br />
* Do you need all the emotion aspects mentioned in the presentation to distinguish emotions? Perhaps not all of them can be adjusted, and if they are all equally important we will not get very far.<br />
* Consider that if you run out of time for your experiment, you will not have much left. The literature study would then have to be more extensive.<br />
* The USE aspect is not yet clear: many more parties are involved as users in this research (insurance companies and care providers get an additional task).<br />
<br />
<br />
<br />
= Personal feedback week 1 =<br />
<br />
General:<br />
*Keep productivity high during meetings<br />
<br />
Meike:<br />
*Good that you carried out your tasks<br />
*Meike indicates that this quarter will be busy, so she has to make a good planning for herself to make sure she does not get into trouble. <br />
<br />
Iris: <br />
*Good that you took the lead on Thursday and in the WhatsApp group; could sometimes take a more positive approach<br />
<br />
Floor: <br />
*It is good that you do what is asked<br />
*Watch your planning in connection with the activities you have alongside your studies<br />
<br />
Suzanne:<br />
*Nicely enthusiastic<br />
*Very variable in interaction: very enthusiastic but sometimes also quiet. <br />
*Keep the wiki up to date and respond in the app<br />
*Suzanne indicates that it was difficult that she was not there on Friday. She missed part of the work because of it and felt bad. She has decided for herself to try to be present at, and involved in, everything.</div>
Logboek (logbook, Control Systems Technology Group wiki), last edited 2014-10-20 by S126005
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Date<br />
! width="10%" | Time<br />
! width="60%" | Description<br />
! width="25%" | Who?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Introductory lecture explaining the project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Coming up with a topic during the lecture<br />
| Meike + Suzanne + Iris + Floor <br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 4 hours<br />
| Searching for sources on techniques for writing without your hands (i.e. with the eyes or the brain)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 3 hours<br />
| Searching for sources on emotions in voices (and their application in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on how far the technology for making robots speak has been developed so far.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on voice cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Sept<br />
| 6 hours<br />
| Sharing sources and discussing what we want and what we are going to do<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 30 min<br />
| Meeting with Raymond Cuijpers to ask questions about the possibilities<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 4 hours<br />
| Settling on the final idea and preparing Monday's presentation<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 1 hour<br />
| Read an article on emotions and worked on the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 15 min<br />
| Called Suzanne with an update about Friday<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 20 min<br />
| Improved the explanation of the slides and the research question<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 75 min<br />
| Searched for sources on characteristic aspects of emotions in speech<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 45 min<br />
| Read an article on emotions and improved the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Prepared the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1.5 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Lecture with presentations<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Meeting: feedback, general action points, agreeing on fixed meeting days and making a short plan.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Cleaned up the week 1 wiki and presented it clearly<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Searched for sources on text-to-speech systems and information about Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Looked up Nao's speech-related functions<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 10 min<br />
| Added to the feedback on the Monday 8 September presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 2 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 4 hours<br />
| Searched for, chose and explored TTS systems, among others in Matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 1.5 hours<br />
| Searched for articles on emotions in speech<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 30 min<br />
| Read the article on emotions that Suzanne found<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 90 min<br />
| Worked in Matlab on pitch and speech rate.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Scanned the articles found yesterday for relevance.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Processed sources and updated the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Found and read the article Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency and put an explanation on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 6 hours<br />
| Discussing progress, taking a different angle on the research<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 3 hours<br />
| Put a very detailed planning on the wiki ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Made the design of the planning for the presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 2 hours 15 min<br />
| Searched for sources on our new angle.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 15 min<br />
| Put the outcome of the Thursday 11 September meeting on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 30 min<br />
| Determined milestones and deliverables and worked on the presentation<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 3 hours<br />
| Made the presentation and put it on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 1 hour<br />
| Started on the overview of emotion characteristics + values<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 2 hours 30 min<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 Sept<br />
| 5 hours<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 1 hour 30 min<br />
| Going through last week's feedback and this week's planning.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 2 hours<br />
| Presentations<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 4 hours<br />
| Searched for a new TTS program that does not alter the voice, and adapted the Matlab functions<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 15 min<br />
| Put the feedback on the Monday 15 September presentation in week 2.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 5 hours 15 min<br />
| Worked out options for adjusting aspects of the robot voice<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 1 hour 30 min<br />
| Research into the best way to work out the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 3 hours<br />
| Searched for sources on persuasiveness and put them on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 4 hours<br />
| Working out the setup for the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 5 hours<br />
| Group meeting in which we came up with the idea for the survey and told each other what we did this week.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 1.5 hours<br />
| Looking at how to approach the definitive sentences<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 2 hours<br />
| Devising and working out the survey setup<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Sept<br />
| 8 hours<br />
| Recording and editing the robot voice. It is now finished: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 3 hours<br />
| Investigating the options for online surveys.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 5 hours<br />
| Working out the definitive version of the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 1 hour 30 min<br />
| Discussing the surveys via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 2 hours<br />
| Discussing the survey setup / progress<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Feedback session<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1.5 hours<br />
| Putting the audio fragments on YouTube<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1 hour<br />
| Performed a power analysis.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 Sept<br />
| 30 min<br />
| Discussing what to do about the number of participants<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Meeting with Raymond and debriefing.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Discussing progress.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 45 min<br />
| Updated the wiki.<br />
| Iris <br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 1 hour<br />
| Discussing the survey.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 3 hours 45 min<br />
| Putting the survey into Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 6 hours<br />
| Made extra sentences. <br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Put the extra sentences on YouTube.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 3 hours<br />
| Drew up the Godspeed questionnaire for the end of the survey and continued working on the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 15 min<br />
| Checking the questionnaire so far and looking for improvements<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 5 hours<br />
| Made a part of the questionnaire in Google Docs<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Giving [[Feedback questionnaire]]<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey on Google Docs. <br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 4 hours<br />
| Did a pilot study with my parents and evaluated it<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 1 hour<br />
| Discussing the survey<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 15 min<br />
| Reading the mail from [[Acapelabox]] and processing the information<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Feedback questionnaire <br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 3 hours<br />
| Discussing progress, giving personal feedback and improving the questionnaire <br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Coach meeting<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Re-uploading the videos to YouTube because not everything went well last time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Made different versions of the questionnaires. <br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Checking whether all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min <br />
| Went through the surveys once more, adjusted questions and fixed cosmetic flaws.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 45 min<br />
| Went through the surveys once more and removed small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 1 hour 30 min<br />
| Inviting people to take the survey.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people to take the survey.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people to take the survey.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 3 hours 30 min<br />
| Looking through the surveys and inviting people to take the survey<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sending reminders for the survey and looking at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and thought out the analysis. <br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour <br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour <br />
| Worked on the design (method)<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min <br />
| Sending people a reminder to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours 30 min<br />
| Writing 'Participants' and coding data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours <br />
| Reminded people to fill in the survey and made a draft of the introduction<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 50 min<br />
| Finished the design (method) and updated the research setup 2.0 (week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the survey.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 30 min<br />
| Coding data<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and put it on the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and searched for the last participant<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Made the first part of the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting, coding data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour 30 min<br />
| Checking the data coding<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 30 min<br />
| Finished the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 15 min<br />
| Checking the coding<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 4 hours 45 min<br />
| Carried out all tasks for Tuesday morning. See [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Oct<br />
| 6 hours 30 min<br />
| Carried out all tasks for Wednesday evening and started on the tasks for Thursday afternoon. See [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion. <br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 9 okt<br />
| 3 uur<br />
| Introductie afgemaakt en de bronnenlijst/verwijzingen gemaakt <br />
| Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 13 okt<br />
| 4,5 uur<br />
| Meeting inclusief coach, feedback en nieuwe planning<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 15 okt<br />
| 5 uur en 30 min<br />
| Discussie over conclusie/discussie en verder schrijven.<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 16 okt<br />
| 6 uur en 45 minuten <br />
| Discussie afschrijven, wiki controleren en opschonen, grafieken verbeteren.<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 16 okt<br />
| 45 minuten <br />
| Informatie van J. Lunenburg verwerken.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 okt<br />
| 45 minuten <br />
| Discussie nog een keer kritisch doorgelezen.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 okt<br />
| 1 uur en 30 minuten <br />
| Heel de samenvatting doorgelezen en verbeterd (ook nieuwe linkjes aangemaakt hierin).<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 18 okt<br />
| 3 uur en 30 minuten <br />
| Eerste opzet van de presentatie gemaakt<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 19 okt<br />
| 20 minuten<br />
| Presentatie doorgekeken en feedback gegeven.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 19 okt<br />
| 20 minuten <br />
| Feedback geven op presentatie.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 19 okt<br />
| 6 uur en 30 minuten <br />
| Presentatie afgemaakt en geoefend. Tevens de samenvatting doorgelezen en gecontroleerd.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 okt<br />
| 2 uur en 30 minuten <br />
| Presentatie geven aan naar andere presentaties luisteren<br />
| Suzanne + Iris + Meike + Floor<br />
|}<br />
<br />
<br />
Total hours: 565 hours 40 min</div>
S126005https://cstwiki.wtb.tue.nl/index.php?title=Logboek&diff=16083Logboek2014-10-19T22:02:58Z<p>S126005: </p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Datum<br />
! width="10%" | Tijd<br />
! width="60%" | Beschrijving<br />
! width="25%" | Wie?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Inleidend college met uitleg van het project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Onderwerp bedenken in college<br />
| Meike + Suzanne + Iris + Floor <br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 4 uur<br />
| Bronnen zoeken over technieken waarmee je kunt schrijven zonder je handen (dus met ogen of hersenen)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 3 uur<br />
| Bronnen zoeken over emoties in stemgeluid (en toegepast in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over in hoeverre op dit moment techniek ontwikkeld is om robots te laten spreken.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over voice-cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 sept<br />
| 6 uur<br />
| Bronnen delen en bespreken wat we willen en wat we gaan doen<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 sept<br />
| 30 min<br />
| Afspraak met Raymond Cuijpers om vragen te stellen over mogelijkheden<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 sept<br />
| 4 uur<br />
| Uiteindelijk idee vastleggen en presentatie voor maandag voorbereiden<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 sept<br />
| 1 uur<br />
| Artikel over emoties gelezen en aan presentatie gewerkt<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 sept<br />
| 15 min<br />
| Suzanne gebeld voor een update over vrijdag<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 sept<br />
| 20 min<br />
| Uitleg van de slides verbeterd en onderzoeksvraag verbeterd<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 sept<br />
| 75 min<br />
| Gezocht naar bronnen over kenmerkende aspecten van emoties in spraak<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 sept<br />
| 45 min<br />
| Artikel over emoties gelezen en wiki verbeterd<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 30 min<br />
| Presentatie voorbereid<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 1,5 uur<br />
| Bronnen gezocht<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 2 uur<br />
| College met presentaties<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 2 uur<br />
| Vergadering: feedback, algemene actiepuntjes, vaste dagen voor meetings afspreken en klein plan maken.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 30 min<br />
| Wiki week 1 opgeschoond en duidelijk neergezet<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 1 uur<br />
| Bronnen gezocht over text-to-speech systemen en informatie over Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 1 uur<br />
| Functies van Nao op gebied van spraak opgezocht<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 10 min<br />
| Feedback presentatie maandag 8 september aangevuld<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 2 uur<br />
| Bronnen gezocht<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 4 uur<br />
| TTS systemen gezocht, gekozen en verdiept. Oa in matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 1,5 uur<br />
| Artikelen emoties in spraak gezocht<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 30 min<br />
| Artikel over emoties gelezen die Suzanne heeft gevonden<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 sept<br />
| 90 min<br />
| In Matlab gewerkt met pitch en spreeksnelheid.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 sept<br />
| 1 uur<br />
| Artikelen die gisteren gevonden zijn gescand op relevantie.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 sept<br />
| 1 uur<br />
| Bronnen verwerkt en de wiki bij gewerkt<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 2 uur<br />
| Artikel Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency gevonden, gelezen en uitleg op de wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 6 uur<br />
| Voortgang bespreken, andere invalshoek op het onderzoek<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 3 uur<br />
| Zeer uitgebreide planning op wiki gezet ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 2 uur<br />
| Design van de planning voor de presentatie gemaakt<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 sept<br />
| 2 uur en 15 min<br />
| Bronnen gezocht over onze nieuwe invalshoek.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 sept<br />
| 15 min<br />
| Uitkomst bijeenkomst donderdag 11 september op wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 sept<br />
| 30 min<br />
| Milestones en deliverables vastgesteld en aan presentatie gewerkt<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 sept<br />
| 3 uur<br />
| Presentatie gemaakt en op wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 sept<br />
| 1 uur<br />
| Begonnen aan overzicht kenmerken emoties + waardes<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 sept<br />
| 2 uur en 30 min<br />
| Overzicht kenmerken emoties + waardes afmaken<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 sept<br />
| 5 uur<br />
| Overzicht kenmerken emoties + waardes afmaken<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 1 uur 30 min<br />
| Feedback vorige week en de planning doornemen van deze week.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 2 uur<br />
| Presentaties<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 4 uur<br />
| Nieuw TTS programma gezocht die geen aanpassingen heeft aan de stem en Matlab functies aangepast<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 15 min<br />
| Feedback presentatie maandag 15 september in week 2 gezet.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 sept<br />
| 5 uur en 15 min<br />
| Opties om aspecten van de robotstem aan te passen uitgezocht<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 sept<br />
| 1 uur en 30 min.<br />
| Onderzoek naar de beste manier voor het uitwerken van de enquetes<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 sept<br />
| 3 uur<br />
| Bronnen gezocht over persuasiveness en op wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 sept<br />
| 4 uur<br />
| Uitwerken opzet voor de enquetes<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 5 hours<br />
| Group meeting in which we came up with the idea for the survey and told each other what we had done this week.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 1.5 hours<br />
| Reviewed the final sentences and discussed how to approach this<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 2 hours<br />
| Devised and worked out the survey setup<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Sept<br />
| 8 hours<br />
| Recorded and edited the robot voice. It is now finished: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 3 hours<br />
| Investigated options for online surveys.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 5 hours<br />
| Worked out the final version of the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 1 hour 30 min<br />
| Discussed the surveys via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 2 hours<br />
| Discussed the survey setup / progress<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Feedback session<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1.5 hours<br />
| Uploaded the audio fragments to YouTube<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1 hour<br />
| Performed a power analysis.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 Sept<br />
| 30 min<br />
| Discussed what to do about the number of participants<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Meeting with Raymond and debriefing.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Progress meeting.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 45 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 1 hour<br />
| Discussed the survey.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 3 hours 45 min<br />
| Put the survey into Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 6 hours<br />
| Created extra sentences.<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Uploaded the extra sentences to YouTube.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 3 hours<br />
| Drew up the Godspeed questionnaire for the end of the survey and continued working on the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 15 min<br />
| Checked the questionnaire so far and looked for improvements<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 5 hours<br />
| Created part of the questionnaire in Google Docs<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Gave [[Feedback questionnaire]]<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey on Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 4 hours<br />
| Conducted a pilot study with my parents and evaluated it<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 1 hour<br />
| Discussed the survey<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 15 min<br />
| Reviewed the e-mail from [[Acapelabox]] and processed the information<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Feedback questionnaire<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 3 hours<br />
| Discussed progress, gave personal feedback and improved the questionnaire<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Coach meeting<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Re-uploaded the videos to YouTube because not everything had gone well last time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Created different versions of the questionnaires.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Checked whether all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Went through the surveys once more, adjusted questions and fixed cosmetic errors.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 45 min<br />
| Went through the surveys once more and removed small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 1 hour 30 min<br />
| Invited people to the survey.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Invited people to the survey.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Invited people to the survey.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 3 hours 30 min<br />
| Looked through the surveys and invited people to the survey<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sent reminders for the survey and looked at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and devised the analysis.<br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour<br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour<br />
| Worked on the design (method)<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min<br />
| Sent a reminder to people to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours 30 min<br />
| Wrote the 'Participants' section and coded data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours<br />
| Reminded people to fill in the survey and made an outline for the introduction<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 50 min<br />
| Finished the design (method) and updated the research setup 2.0 (week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the survey.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 30 min<br />
| Coded data<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and placed it on the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and looked for the last participant<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Created the first part of the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting, coded data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour 30 min<br />
| Checked the data coding<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 30 min<br />
| Finished the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 15 min<br />
| Checked the coding<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 4 hours 45 min<br />
| Completed all tasks for Tuesday morning. See [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Oct<br />
| 6 hours 30 min<br />
| Completed all tasks for Wednesday evening and started on the tasks for Thursday afternoon. See [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 3 hours<br />
| Finished the introduction and created the reference list/citations<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Oct<br />
| 4.5 hours<br />
| Meeting including coach, feedback and a new planning<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Oct<br />
| 5 hours 30 min<br />
| Discussed the conclusion/discussion and continued writing.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 6 hours 45 min<br />
| Finished writing the discussion, checked and cleaned up the wiki, improved the graphs.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 45 min<br />
| Processed information from J. Lunenburg.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 45 min<br />
| Critically read through the discussion once more.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 1 hour 30 min<br />
| Read through and improved the entire summary (and also created new links in it).<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 3 hours 30 min<br />
| Made the first draft of the presentation<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Oct<br />
| 20 min<br />
| Reviewed the presentation and gave feedback.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 20 min<br />
| Gave feedback on the presentation.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 6 hours 30 min<br />
| Finished and rehearsed the presentation. Also read through and checked the summary.<br />
| Suzanne<br />
<br />
|}<br />
<br />
<br />
Total number of hours: 555 hours and 40 minutes</div>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Datum<br />
! width="10%" | Tijd<br />
! width="60%" | Beschrijving<br />
! width="25%" | Wie?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Inleidend college met uitleg van het project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Onderwerp bedenken in college<br />
| Meike + Suzanne + Iris + Floor <br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 4 uur<br />
| Bronnen zoeken over technieken waarmee je kunt schrijven zonder je handen (dus met ogen of hersenen)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 3 uur<br />
| Bronnen zoeken over emoties in stemgeluid (en toegepast in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over in hoeverre op dit moment techniek ontwikkeld is om robots te laten spreken.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over voice-cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 sept<br />
| 6 uur<br />
| Bronnen delen en bespreken wat we willen en wat we gaan doen<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 sept<br />
| 30 min<br />
| Meeting with Raymond Cuijpers to ask questions about possibilities<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 4 hours<br />
| Finalized the idea and prepared the presentation for Monday<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 1 hour<br />
| Read an article on emotions and worked on the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 15 min<br />
| Called Suzanne for an update about Friday<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 20 min<br />
| Improved the explanation of the slides and improved the research question<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 75 min<br />
| Searched for sources on characteristic aspects of emotions in speech<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 45 min<br />
| Read an article on emotions and improved the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Prepared the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1.5 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Lecture with presentations<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Meeting: feedback, general action points, agreeing on fixed days for meetings and making a small plan.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Cleaned up the week 1 wiki and presented it clearly<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Searched for sources on text-to-speech systems and information about Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Looked up Nao's speech-related functions<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 10 min<br />
| Added to the feedback on the presentation of Monday 8 September<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 2 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 4 hours<br />
| Searched for, chose and studied TTS systems in depth, among others in Matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 1.5 hours<br />
| Searched for articles on emotions in speech<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 30 min<br />
| Read the article on emotions that Suzanne found<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 90 min<br />
| Worked in Matlab with pitch and speech rate.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Scanned the articles found yesterday for relevance.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Processed sources and updated the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Found and read the article Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency and put an explanation on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 6 hours<br />
| Discussed progress, a different angle on the research<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 3 hours<br />
| Put a very detailed planning on the wiki ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Made the design of the planning for the presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 2 hours and 15 min<br />
| Searched for sources on our new angle.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 15 min<br />
| Put the outcome of the Thursday 11 September meeting on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 30 min<br />
| Determined milestones and deliverables and worked on the presentation<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 3 hours<br />
| Made the presentation and put it on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 1 hour<br />
| Started on the overview of emotion features + values<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 2 hours and 30 min<br />
| Finished the overview of emotion features + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 Sept<br />
| 5 hours<br />
| Finished the overview of emotion features + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 1 hour 30 min<br />
| Went through last week's feedback and this week's planning.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 2 hours<br />
| Presentations<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 4 hours<br />
| Searched for a new TTS program that makes no adjustments to the voice, and adjusted Matlab functions<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 15 min<br />
| Put the feedback on the presentation of Monday 15 September in week 2.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 5 hours and 15 min<br />
| Investigated options for adjusting aspects of the robot voice<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 1 hour and 30 min<br />
| Researched the best way to work out the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 3 hours<br />
| Searched for sources on persuasiveness and put them on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 4 hours<br />
| Worked out the setup for the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 5 hours<br />
| Group meeting in which we came up with the idea for the survey and told each other what we had done this week.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 1.5 hours<br />
| Looked at how to approach the final sentences<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 2 hours<br />
| Devised and worked out the survey setup<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Sept<br />
| 8 hours<br />
| Recorded and edited the robot voice. It is now finished: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 3 hours<br />
| Investigated options for online surveys.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 5 hours<br />
| Worked out the final version of the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 1 hour and 30 min<br />
| Discussed the surveys via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 2 hours<br />
| Discussed the survey setup / discussed progress<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Feedback session<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1.5 hours<br />
| Put the audio fragments on YouTube<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1 hour<br />
| Performed a power analysis.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 Sept<br />
| 30 min<br />
| Discussed what to do about the number of participants<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour and 30 min<br />
| Meeting with Raymond and debriefing.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour and 30 min<br />
| Discussed progress.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 45 min<br />
| Updated the wiki.<br />
| Iris <br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 1 hour<br />
| Discussed the survey.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 3 hours and 45 min<br />
| Put the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 6 hours<br />
| Made extra sentences.<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Put the extra sentences on YouTube.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 3 hours<br />
| Drew up the Godspeed questionnaire for the end of the survey and continued working on the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 15 min<br />
| Checked the questionnaire so far and looked for improvements<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 5 hours<br />
| Made part of the questionnaire in Google Docs<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Gave [[Feedback questionnaire]]<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 4 hours<br />
| Performed a pilot study with my parents and evaluated it<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 1 hour<br />
| Discussed the survey<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 15 min<br />
| Read the mail from [[Acapelabox]] and processed the information<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Feedback questionnaire<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 3 hours<br />
| Discussed progress, gave personal feedback and improved the questionnaire<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Coach meeting<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour and 30 min<br />
| Re-uploaded the videos to YouTube because not everything had gone well last time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour and 30 min<br />
| Made different versions of the questionnaires.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Checked whether all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour and 30 min<br />
| Went through the surveys once more, adjusted questions and removed cosmetic errors.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 45 min<br />
| Went through the surveys once more and removed small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 1 hour and 30 min<br />
| Invited people to the survey.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Invited people to the survey.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Invited people to the survey.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 3 hours and 30 min<br />
| Looked through the surveys and invited people to the survey<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sent reminders for the survey and looked at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and devised the analysis.<br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour<br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour<br />
| Worked on the design (method)<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min<br />
| Sent people a reminder to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours and 30 min<br />
| Wrote 'Participants' and coded data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours<br />
| Reminded people to fill in the survey and made a setup for the introduction<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour and 50 min<br />
| Finished the design (method) and updated the setup of research 2.0 (week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours and 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the survey.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour and 30 min<br />
| Coded data<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and placed it on the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and looked for the last participant<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Made the first part of the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting, coded data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour and 30 min<br />
| Checked the data coding<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour and 30 min<br />
| Finished the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour and 15 min<br />
| Checked the coding<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 4 hours and 45 min<br />
| Performed all tasks of Tuesday morning. See [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Oct<br />
| 6 hours and 30 min<br />
| Performed all tasks of Wednesday evening and started on the tasks of Thursday afternoon. See [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 3 hours<br />
| Finished the introduction and made the reference list/citations<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Oct<br />
| 4.5 hours<br />
| Meeting including coach, feedback and new planning<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Oct<br />
| 5 hours and 30 min<br />
| Discussed the conclusion/discussion and continued writing.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 6 hours and 45 min<br />
| Finished writing the discussion, checked and cleaned up the wiki, and improved the graphs.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 45 min<br />
| Processed the information from J. Lunenburg.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 45 min<br />
| Critically read through the discussion once more.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 1 hour and 30 min<br />
| Read through and improved the whole summary (also created new links in it).<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Oct<br />
| 20 min<br />
| Looked through the presentation and gave feedback.<br />
| Iris<br />
<br />
|}<br />
<br />
<br />
Total number of hours: 545 hours and 20 minutes</div>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that spread all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although they are working on such technology, it can be further improved by adding emotion to a person's own voice. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Brusso, et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even if it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice has a positive effect on likeability, animacy and persuasiveness (Yoo, & Gretzel, 2011) (Vosse, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has a broader implementation. It can be used when no anatomical features of emotions are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’. (Acapela Group, 2009) To generate the voice of Will, the speech of a person was recorded; those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences. (Acapela Group, 2014) The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping. (Acapela Group, 2009) This can be very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation. (Williams & Stevens, 1972) A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry. (Williams & Stevens, 1972) Because the speech rates of happy and angry do not differ much, this value of 4.15 syllables per second was also used for happiness. (Breazeal, 2001) Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice has a low average pitch. (Liscombe, 2007) So the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is a free, open-source, cross-platform software package to record and edit audio. (Audacity, 2014) Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function ‘amplify’ you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto. The loudness of a neutral voice is close to that of the sad voice. (Bowles & Pauletto, 2010) The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence. (Williams & Stevens, 1972) A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches. (Breazeal, 2001) In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘adjust pitch’ and filling in the percentage by which you want to change the pitch. Another feature is that a sad voice has longer breaks between two words compared to all other emotions. (Bowles & Pauletto, 2010) By selecting a break between words and using the function ‘change tempo’ in Audacity, this break can be lengthened. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and if a person is sad, longer words are pronounced 20% slower and short words are pronounced 10-20% slower than in a neutral voice. (Bowles & Pauletto, 2010) With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
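The 7 dB level difference mentioned above translates into a fixed linear scaling of the waveform samples. The snippet below is only an illustrative sketch of that relation (the actual editing was done in Audacity's GUI); the `signal` array is a made-up toy example.<br />

```python
import numpy as np

def db_to_gain(db):
    # Convert a level difference in decibels to a linear amplitude factor.
    return 10 ** (db / 20.0)

# Happy fragments were made about 7 dB louder than sad ones
# (Bowles & Pauletto, 2010); on raw samples that is a simple scaling.
signal = np.array([0.10, -0.20, 0.15])    # toy waveform samples
happy_signal = signal * db_to_gain(7.0)   # roughly 2.24x the amplitude
```

A gain of 0 dB leaves the signal unchanged, which is a quick sanity check on the conversion.<br />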
<br />
(For the research ten different sentences were used and adjusted. To see more specific adjustments on these sentences, click here: [[Adjustments sentences]])<br />
<br />
With Google Forms you can create a new survey with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program for creating and saving spreadsheets. Such a spreadsheet can be imported into the program Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed in a two-tailed t-test and 51 participants per condition in a one-tailed t-test. A one-tailed t-test is suitable for this experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which an effect could occur, and a two-tailed t-test is not needed. <br />
However, it is presumable that the effect size is small rather than moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small: 394 participants per condition for a two-tailed t-test and 310 participants per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
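The sample sizes above can be reproduced with the standard normal approximation for a two-sample t-test, plus the usual small-sample correction term. This is only a sketch of the calculation, not necessarily the tool the group used; `d` is Cohen's effect size (0.5 moderate, 0.2 small).<br />

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8, two_tailed=True):
    # Per-group sample size for a two-sample t-test: normal approximation
    # n = 2*(z_alpha + z_beta)^2 / d^2, with correction z_alpha^2 / 4.
    z_a = NormalDist().inv_cdf(1 - alpha / (2 if two_tailed else 1))
    z_b = NormalDist().inv_cdf(power)
    n = 2 * (z_a + z_b) ** 2 / d ** 2 + z_a ** 2 / 4
    return ceil(n)

print(n_per_group(0.5))                    # moderate effect, two-tailed: 64
print(n_per_group(0.5, two_tailed=False))  # moderate effect, one-tailed: 51
print(n_per_group(0.2))                    # small effect, two-tailed: 394
print(n_per_group(0.2, two_tailed=False))  # small effect, one-tailed: 310
```

These match the participant numbers reported in the text.<br />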
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5 and are interval variables. The dependent variables were composed of several questions in the questionnaire: for likeability, questions 27, 30, 35, 38 and 40 were used; for animacy, questions 28, 31, 32 and 34; and persuasiveness was composed of questions 29, 33, 36 and 37.<br />
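Composing a dependent variable from several questionnaire items comes down to averaging the item scores per participant. A minimal sketch with made-up responses (the column labels mirror the likeability items named above, but the numbers are hypothetical):<br />

```python
import numpy as np

# Hypothetical response matrix: rows are participants, columns are the
# five likeability items (questions 27, 30, 35, 38, 40) on a 1-5 scale.
responses = np.array([
    [4, 5, 3, 4, 4],
    [2, 3, 2, 1, 2],
])

# A composite scale is the mean of its items for each participant.
likeability = responses.mean(axis=1)   # one score per participant
```

The same averaging applies to the animacy and persuasiveness items, and to the two covariate scales.<br />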
<br />
Because the survey is about water consumption and persuasiveness, two variables were made that could influence the way participants responded to the comments of Will. These covariates measure how easily a participant is convinced and how much they care about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of questions 5, 8, 13 and 18, and how much you care about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
(The questions of the questionnaire can be found at [[Research design]].)<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part was a general one in which some demographic information was requested. In addition, some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some closing questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from our lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits because they live in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. Besides, one extra person was removed from the data set, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants who were totally distracted while filling in the questionnaire were removed. After this, 94 participants were left.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, and their age was between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, and their age was between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education differed among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, and the second is for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception were used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; these values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
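Cronbach's alpha, used above to check scale reliability, can be computed directly from an items matrix. This is an illustrative sketch assuming scores are arranged as participants × items (the actual analysis was done in Stata):<br />

```python
import numpy as np

def cronbach_alpha(items):
    # items: participants x items matrix of Likert scores.
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_var / total_var)

# Perfectly consistent (parallel) items yield an alpha of 1.0:
# cronbach_alpha([[1, 1], [2, 2], [3, 3]]) -> 1.0
```

Values around 0.85-0.89, as reported here, indicate good internal consistency for a scale.<br />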
<br />
To test whether persuasion of the voice was perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they had said that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
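The reported <math>\eta ^2</math> values for the ANOVAs are the ratio of between-group to total sum of squares. A small sketch of that computation with made-up groups (illustrative only; the analysis itself was run in Stata):<br />

```python
import numpy as np

def eta_squared(*groups):
    # eta^2 = SS_between / SS_total for a one-way design.
    all_vals = np.concatenate(groups)
    grand = all_vals.mean()
    ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
    ss_total = ((all_vals - grand) ** 2).sum()
    return ss_between / ss_total

# Identical group means give eta^2 = 0 (no effect), as with persuasion here:
# eta_squared([1, 2, 3], [1, 2, 3]) -> 0.0
```

An <math>\eta ^2</math> near zero, as found for persuasion, means the condition explains almost none of the variance in the ratings.<br />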
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (kwallis in Stata) was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire. This scale is from 1 to 5.<br />
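The conversion from the Kruskal-Wallis statistic to the effect size <math> \eta ^2=\frac{\chi ^2}{N-1}</math> can be sketched as follows. This assumes SciPy is available and uses made-up group ratings; the H statistic is treated as the chi-square value, as in the formula above.<br />

```python
from scipy.stats import kruskal

def kruskal_eta_squared(group1, group2):
    # Effect size for a two-group Kruskal-Wallis test:
    # eta^2 = H / (N - 1), with H the Kruskal-Wallis statistic.
    h, p = kruskal(group1, group2)
    n = len(group1) + len(group2)
    return h / (n - 1), p

# Example with hypothetical ratings from the two conditions:
effect, p_value = kruskal_eta_squared([3, 4, 5, 4], [2, 3, 2, 3])
```

The same conversion applies to the follow-up tests on the positive-only and negative-only subsets.<br />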
<br />
As a follow-up test, the Kruskal-Wallis test was executed another two times: once with only participants that heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants that heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4. The y-axis of figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept of persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect remained non-significant, and the effect size hardly increased.<br />
<br />
Likeability also showed a non-significant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. Several things can explain this. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: more information about the water consumption should be available, and constructive arguments should be given. Leaving out this information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlled for positive fragments, another problem may have affected the results. A lot of participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice that was used human-like, and therefore probably did not find it very likeable. However, when mostly positive fragments were heard, the robot was apparently perceived as more likeable, so the positive fragments probably sounded more human-like than the negative ones did. <br />
<br />
For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA, obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size of animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice. They might therefore have perceived it as more lively, because there was no other voice with which they could compare it. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of only 24 and 10 persons, respectively. <br />
<br />
Besides the previously mentioned issues, more improvements can be made. These limitations influence all three measured concepts. To begin with, given the time to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. It is possible that the difference between the neutral voice and the emotionally loaded voice was somewhat hard to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices. As was stated in the method, the spoken text that comes from Acapela is not computer-generated; it is recorded by a human speaker, and this recording is the base for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with and without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to do that for this research, because it is already quite easy for manufacturers of robots and speech programs to generate a better-sounding voice than a robotic one. The practical applicability of this research would therefore have decreased if the neutral condition had been a robotic voice.<br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustic features and the meaning of the sentence, they have a reinforcing effect. This reinforcing effect of combining different features was also found in previous studies. <br />
<br />
Now let us look back at the problem that was given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Researchers are working on technology that allows people with ALS to use their own voice with speech technology (EUAN MacDonald Centre). This research looks further than using the sound of the voice of people with ALS. By using someone's own sound the level of animacy would improve a lot, but as this research shows, adding acoustic features of emotion to a voice produced by a TTS will also enhance the level of animacy. The perceived likeability when using certain emotions will also be increased by implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the Nao robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The Nao robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but in the documentation of the text-to-speech program of Nao nothing is said about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that Nao can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all of the parameters used can be changed for Nao. The purpose of Nao can vary a lot. If you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of Nao expand the changeable parameters, Nao can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, Google's TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research which can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options apply only to a whole fragment; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. However, the TTS developed by Philips is already quite advanced: it is possible to select a certain emotion, including sad and exciting (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are already predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the TTS from Philips make Amigo more flexible than Nao for communicating emotional sentences. <br />
<br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, such as 'Wow' or sobbing sounds. This could enhance the findings of this research, so this research can serve as a basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion that is based on acoustic features alone, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate Statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that went all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways; for example, at some point they may not be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although they are working on such technology, it can be further improved by adding emotion to a person's own sound. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even if it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but which express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has broader applications. It can be used when no anatomical features of emotions are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that want to convince people to behave in a certain way, and some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the wanted change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of the programs is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages with your text and their voices. The voice of Will was used, because this one is English (US) and has the different functions 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. When sadness occurs, a sentence is spoken more slowly than when happiness occurs, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, it was chosen to use this value of 4.15 syllables per second for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007), so the Acapela Box was used to change this voice shape.<br />
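The speech-rate figures above translate directly into target durations for a fragment. A small sketch (the 12-syllable sentence is a hypothetical example, not one of the study's fragments):

```python
# Speech rates quoted above (Williams & Stevens, 1972; Breazeal, 2001),
# in syllables per second; the angry rate of 4.15 is reused for happy
# as described in the text.
SYLLABLES_PER_SECOND = {"sad": 1.91, "happy": 4.15}

def target_duration(n_syllables, emotion):
    """How long a fragment should last when spoken at the emotion's rate."""
    return n_syllables / SYLLABLES_PER_SECOND[emotion]

# A hypothetical 12-syllable sentence:
sad_s = target_duration(12, "sad")      # about 6.3 seconds
happy_s = target_duration(12, "happy")  # about 2.9 seconds
```

The same sentence thus lasts more than twice as long in the sad condition, which matches the long pauses and slow pronunciation described above.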
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function 'amplify' you may choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average, according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect 'adjust pitch' and filling in the percentage by which you want to change the pitch. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function 'change tempo' in Audacity, the length of this break can be made longer. This function can also help you to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function 'change tempo' the speed of the voice changes, but the pitch does not, and this is exactly what is needed for these emotions.<br />
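The 7 dB amplitude difference mentioned above corresponds to a linear peak-amplitude ratio of roughly 2.2, which is the factor a dB-based amplify effect applies to the waveform. A quick check:

```python
def db_to_amplitude_ratio(db):
    """Convert a decibel difference into a linear amplitude ratio (20 dB = 10x)."""
    return 10 ** (db / 20)

happy_vs_sad = db_to_amplitude_ratio(7)  # ~2.24: happy peaks are about 2.2x the sad peaks
```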
<br />
(For the research ten different sentences were used and adjusted. To see more specific adjustments on these sentences, click here: [[Adjustments sentences]])<br />
<br />
With Google Forms you can create a new survey with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. This spreadsheet can be imported in the program Stata and from there on it is considered as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test would not be needed. <br />
However, it is likely that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
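The sample sizes above can be approximated with the standard normal-approximation formula for a two-sample t-test. This is a sketch, not the original analysis: the exact t-based numbers in the text (64 / 51 / 394 / 310 per group) can differ by about one participant because of the t-distribution's degrees-of-freedom correction.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80, tails=2):
    """Normal-approximation sample size per group for a two-sample t-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)  # critical value
    z_power = NormalDist().inv_cdf(power)              # power quantile
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

medium_two_tailed = n_per_group(0.5, tails=2)  # 63 (text reports 64 with t correction)
small_two_tailed = n_per_group(0.2, tails=2)   # 393 (text reports 394)
small_one_tailed = n_per_group(0.2, tails=1)   # 310, matching the text
```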
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables were measured on a Likert scale varying from 1 to 5 and are interval variables. The dependent variables were composed of several questions in the questionnaire: for likeability, questions 27, 30, 35, 38 and 40 were used; for animacy, questions 28, 31, 32 and 34; and for persuasiveness, questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuasiveness, two covariates were made which could influence the way participants responded to the comments of Will. These covariates measure how easy a participant is to convince and how much they care about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of questions 5, 8, 13 and 18, and how much you care about the environment of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
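The scale composition described in this and the previous paragraph can be sketched as follows. The item numbers come from the text; the participant's answers below are hypothetical, and averaging the items into a scale score is one common choice (the text does not state the exact aggregation).

```python
# Which questionnaire items feed each scale, as listed above.
SCALES = {
    "likeability": [27, 30, 35, 38, 40],
    "animacy": [28, 31, 32, 34],
    "persuasiveness": [29, 33, 36, 37],
    "easily_convinced": [5, 8, 13, 18],
    "cares_about_environment": [4, 6, 10, 11, 12, 15, 17, 19],
}

def scale_score(answers, scale):
    """Mean of a participant's item ratings (answers: item number -> 1..5)."""
    items = SCALES[scale]
    return sum(answers[i] for i in items) / len(items)

# A hypothetical participant who answered 3 everywhere except item 27:
answers = {q: 3 for q in range(1, 41)}
answers[27] = 5
likeability = scale_score(answers, "likeability")  # (5 + 3 + 3 + 3 + 3) / 5 = 3.4
```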
<br />
(The questions of the questionnaire can be found at [[Research design]].)<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part was the general one, in which some demographic information was asked, along with some questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone filled in some questions about their showering habits. After each answer, an audio fragment was heard which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some questions to give us an indication about general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who may be unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. One extra person was removed from the data set, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. After this we were left with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education differed among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception were used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; the values were 0.85, 0.89, and 0.85, respectively. The concepts will be discussed in that order. <br />
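The reliability check above uses Cronbach's alpha, which compares the sum of the per-item variances with the variance of the scale totals. A minimal sketch with hypothetical ratings (not the study's data):

```python
def cronbach_alpha(rows):
    """Cronbach's alpha for a rating matrix: rows = participants, columns = items."""
    k = len(rows[0])  # number of items in the scale

    def sample_var(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [sample_var([row[i] for row in rows]) for i in range(k)]
    total_var = sample_var([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

ratings = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [1, 2, 2]]  # hypothetical 3-item scale
alpha = cronbach_alpha(ratings)  # ~0.94: the items move together, so the scale is reliable
```

Values around 0.85, as reported above, are conventionally considered good internal consistency.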
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Lickert scale, that is encoded from -2 (totally disagree) to +2 (totally agree). Zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who either only heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or four negative audiofragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric kwallist test was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the God speed questionnaire. This scale is from 1 to 5.<br />
<br />
As a follow up test, the kwallis was executed another two times, but now one time with only participants that at least heard four positive audiofragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and the second time with only participants that at least heard four negative audiofragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audiofragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between the two conditions. This test was chosen because the assumptions for an ANOVA were not met (equal variance across both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4. The y-axis of figure 4 represents the scale from the Godspeed questionnaire, coded from 1 to 5.<br />
<br />
To test whether the type of emotion heard influences perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, persuasion did not show a significant difference between the conditions, as the high p-value indicates. Moreover, the effect size was close to zero. Even after controlling for willingness to adjust showering habits, the effect remained non-significant and the effect size hardly increased.<br />
<br />
Likeability also showed a non-significant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings, for persuasion and for likeability, go against the hypotheses that were formulated in the introduction. This can be explained in several ways. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence also matters: more information about water consumption should be available and supporting arguments should be given. Information was deliberately left out in order to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlled for positive fragments, another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used here human-like and therefore probably did not find it very likeable. Apparently, however, when mostly positive fragments were heard, the robot was perceived as more likeable, so the positive fragments probably sounded more human-like than the negative ones did. <br />
<br />
For animacy a significant effect was found between the two conditions: perceived animacy was higher in the condition with emotion than in the condition without. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). This finding was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes for both these groups were bigger than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice; they might have perceived it as more lively because there was no other voice to compare it with. However, these findings were not significant, and their reliability is questionable because the two groups consisted of only 24 and 10 persons, respectively. <br />
<br />
Besides the previously mentioned issues, more improvements can be made. These limitations influence all three measured concepts. To begin with, given the time available to complete this research, concessions had to be made on the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so more participants would have been needed to enhance reliability. A second problem might be the kind of speech program that was used. It is possible that the difference between the neutral voice and the emotionally loaded voice was somewhat hard to hear, which again decreases the chance of finding an effect. The reason for this lies in the way the program created the voices. As stated in the method, the spoken text that comes from Acapela is not computer-generated; it is recorded by a human speaker, and this recording is the base for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with and one without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to do so for this research, because it is already quite easy for manufacturers of robots and speech programs to generate a better-sounding voice than a robotic one. The practical applicability of this research would therefore have decreased if the neutral condition had been a robotic voice.<br />
<br />
The outcome of this research is in accordance with the previous research stated in the introduction: when multiple characteristics of emotion are combined, e.g. acoustic features and the meaning of the sentence, they have a reinforcing effect. This reinforcing effect of combining different features was also found in previous studies. <br />
<br />
Now let us look back at the problem posed at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but in many cases this becomes impossible as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Researchers are working on technology that allows people with ALS to use their own voice in speech technology (EUAN MacDonald Centre). This research looks further than using the sound of the voice of people with ALS. Using someone's own sound would improve the level of animacy a lot, but as this research shows, adding acoustic features of emotion to a voice produced by a TTS will also enhance the level of animacy. The perceived likeability when using certain emotions will also increase when the acoustic features are implemented. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as suggested in the introduction. After analysing the results there was no significant difference between using acoustic features of emotion or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry. Examples of such robots are the Nao robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The Nao robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of Nao says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that Nao can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all of the parameters used can be changed on Nao. The purpose of Nao can vary a lot. If it is known for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of Nao's speech technology expand the changeable parameters, Nao can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research and that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options only apply to a whole fragment; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not provide a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, however, is already quite advanced. It is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the TTS from Philips make Amigo more flexible than Nao for communicating emotional sentences. <br />
<br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots. At this moment, the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, such as 'Wow' or sobbing sounds. This could enhance the findings of this research, which can thus serve as a basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate Statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. New York: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>

S126005

https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15883 Samenvatting (2014-10-19T11:46:33Z)

<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that spread all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways; for example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology is being worked on, it can be further improved by adding emotion to a person's own sound. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample groups to recognize certain emotions like sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotion leads to the correct recognition of a specific emotion. When implementing these findings in a speech program it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even if it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences without acoustic emotion that express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotion? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotion to a sentence with grammatical emotional features?<br />
<br />
Based on prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it also has broader applications. It can be used when no anatomical features of emotion are available but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could strengthen the convincingness of the device, which leads to the wanted change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. Their use is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from written text using its own voices. The voice of Will was used because it is English (US) and has the different functions 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are then applied to the written text. In this process attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009), which is very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, it was chosen to use this value of 4.15 syllables per second for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
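The cited speech rates translate directly into utterance durations. A small sketch (the 12-syllable sentence length is an arbitrary example, not one of the study's stimuli):

```python
def speaking_time(n_syllables, syllables_per_second):
    """Approximate utterance duration at a given speech rate."""
    return n_syllables / syllables_per_second

# A hypothetical 12-syllable sentence at the rates from Williams & Stevens (1972)
sad_s = speaking_time(12, 1.91)    # about 6.3 seconds
happy_s = speaking_time(12, 4.15)  # about 2.9 seconds
print(round(sad_s, 1), round(happy_s, 1))
```

The same sentence thus takes roughly twice as long in the sad voice as in the happy one.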
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which an audio fragment can be given an emotional tone. With the function 'amplify' a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto, while the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, a happy voice shows a variety of different pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect 'change pitch' and filling in the desired percentage. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function 'change tempo', the break can be lengthened. This function can also be used to lengthen or shorten individual words, which is useful because words with only one syllable are pronounced faster when happy, and when a person is sad, long words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With 'change tempo' the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
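The amplitude and tempo adjustments described above can also be expressed numerically. A sketch, assuming the 7 dB figure applies to peak amplitude and taking 15% as an assumed midpoint of the 10-20% range for short words:

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)

def sad_word_duration(duration_s, n_syllables):
    """Stretch a word for a sad voice: long words ~20% slower,
    short words ~15% slower (assumed midpoint of the cited 10-20%)."""
    factor = 1.20 if n_syllables > 1 else 1.15
    return duration_s * factor

print(round(db_to_amplitude_ratio(7), 2))  # 2.24: happy peaks at ~2.2x the sad amplitude
```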
<br />
(For the research ten different sentences were used and adjusted. To see more specific adjustments on these sentences, click here: [[Adjustments sentences]])<br />
<br />
With Google Forms a survey can be created collaboratively; it is a tool used to collect information. Audio fragments (via video) can be inserted and any kind of question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata was used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to estimate how many participants would be needed. The power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test would be suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. There is therefore only one direction in which an effect could occur, so a two-tailed t-test is not needed. <br />
However, the effect size is presumably small rather than moderate. Another power analysis was executed to see how many participants would be needed for a small effect size: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed one. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
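Sample sizes like these follow from the standard normal-approximation formula n = 2((z_alpha + z_beta)/d)^2 per group; the exact t-based numbers quoted in the text come out one or two higher. A sketch using only the Python standard library:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80, two_tailed=True):
    """Normal-approximation sample size per group for a two-sample t-test
    at Cohen's d effect size `d`."""
    z_alpha = NormalDist().inv_cdf((1 - alpha / 2) if two_tailed else (1 - alpha))
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

print(n_per_group(0.5))                    # 63 (text: 64 from the exact t-test)
print(n_per_group(0.5, two_tailed=False))  # 50 (text: 51)
print(n_per_group(0.2))                    # 393 (text: 394)
print(n_per_group(0.2, two_tailed=False))  # 310
```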
<br />
For the experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group heard only one of the conditions. These two conditions made up the independent variable, which is therefore categorical. The dependent variables were likeability, animacy and persuasiveness, each measured on a Likert scale from 1 to 5; all dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire: likeability of questions 27, 30, 35, 38 and 40; animacy of questions 28, 31, 32 and 34; and persuasiveness of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuasiveness, two variables were created that could influence the way participants responded to the comments of Will. These covariates measure how easily a person is convinced and how much a person cares about the environment. Both covariates are composed of several questions in the questionnaire: how easily one is convinced of questions 5, 8, 13 and 18, and how much one cares about the environment of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
(The questions of the questionnaire can be found at [[Research design]].)<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts.<br />
The first part was a general one asking for demographic information, along with some questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly guessing the research goal. The second part consisted of a simulation in which everyone filled in some questions about their showering habits; after each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, some questions were added to give an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were recruited from our lists of friends on Facebook. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. This left 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception were used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; the values were 0.85, 0.89, and 0.85, respectively. The concepts are discussed in that order. <br />
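Cronbach's alpha can be computed from the per-item variances and the variance of the summed scale: alpha = k/(k-1) * (1 - sum of item variances / variance of totals). A minimal sketch on hypothetical item scores (not the study's data):

```python
def cronbach_alpha(items):
    """Cronbach's alpha; `items` is one list of scores per questionnaire item."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Total scale score per participant
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(item) for item in items) / var(totals))

# Three hypothetical 5-point items answered by four participants
print(round(cronbach_alpha([[4, 2, 5, 1], [4, 3, 5, 2], [5, 2, 4, 1]]), 2))  # 0.96
```

Strongly correlated items, as in this hypothetical set, give an alpha close to 1.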
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Lickert scale, that is encoded from -2 (totally disagree) to +2 (totally agree). Zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who either only heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or four negative audiofragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric kwallist test was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the God speed questionnaire. This scale is from 1 to 5.<br />
<br />
As a follow up test, the kwallis was executed another two times, but now one time with only participants that at least heard four positive audiofragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and the second time with only participants that at least heard four negative audiofragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audiofragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio-fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the ANOVA assumptions were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference found can be seen in figure 4. The y-axis of figure 4 represents the scale encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard has an influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value. Besides that, the effect size was taken into consideration, and it was close to zero. Even after controlling for willingness to adjust showering habits, the effect was non-significant, and the effect size hardly increased.<br />
<br />
Likeability also showed a non-significant effect, but in contrast to persuasion a small effect size was found. However, a significant small to medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings, for persuasion and likeability, go against the hypotheses that were formulated in the introduction. This can be explained by several things. First, for persuasion, it may be that emotion alone is not enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there should be more information available about the water consumption, and convincing arguments should be given. Leaving out information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlling for positive fragments, another problem may affect the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used here human-like and therefore probably did not find it very likeable. However, when mostly positive fragments were heard, the robot was apparently perceived as more likeable. So the positive fragments probably sounded more human-like than the negative ones did. <br />
<br />
For animacy a significant effect was found between the two conditions. In the condition with emotion the perceived animacy was higher than in the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both of these groups than the effect size for animacy in general. An explanation for this could be that people who heard the same emotion several times became more accustomed to that voice. They might therefore have perceived it as more lively because they did not hear any other voice with which they could compare it. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of 24 and 10 persons, respectively. <br />
<br />
Besides the previously mentioned issues, more improvements can be made. These limitations influence all three measured concepts. To begin with, given the time available to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants that took part was only enough to reliably find a medium effect. To enhance reliability, more participants would have been needed. A second problem might be the kind of speech program that was used. It is possible that the difference between the neutral voice and the emotionally loaded voice was somewhat hard to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices. As was stated in the method, the spoken text that comes from Acapela is not computer-generated; it is recorded by a human speaker, and this recording is the base for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research, because it is already quite easy for manufacturers of robots and speech programs to generate a better-sounding voice than a robotic one. The practical applicability of this research would therefore have decreased if the neutral condition had been a robotic voice.<br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction. Namely, when multiple characteristics of emotion are combined, e.g. acoustic features and the meaning of the sentence, they have a reinforcing effect. This reinforcing effect of combining different features was also found in previous studies. <br />
<br />
Now let us look back at the problem that was given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Researchers are working on technology that allows people with ALS to use their own voice with speech technology (source needed). This research looks further than using the sound of the voice of people with ALS. By using someone’s own sound the level of animacy would improve a lot, but as this research shows, adding acoustic features of emotion to a voice produced by a TTS will also enhance the level of animacy. The perceived likeability when using certain emotions will also be increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as discussed in the introduction. After analysing the results there was no significant difference between using acoustic features of emotion or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader application of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other applications. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with facial expressions. Examples of this kind of robot are the Nao robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The Nao robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of Nao says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that Nao can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all of the parameters used can be changed on Nao. The purpose of Nao can vary a lot. If it is known for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of Nao expand the changeable parameters, Nao can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, Google TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research and that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options apply only to a whole fragment; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, on the other hand, is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the Philips TTS make Amigo more flexible than Nao for communicating emotional sentences. <br />
<br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, for example ‘Wow’ or sobbing sounds. This could enhance the findings of this research, so this research can also serve as a basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion that is based only on acoustic features, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>S126005

https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15882
Samenvatting
2014-10-19T11:45:04Z
<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that their right for freedom is protected by the constitution and several human right organizations. Freedom is related to both the physical- and mental state and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that went all over Facebook. The idea of the campaign was to create more awareness of this disease so that money would be raised for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person to breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may not be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre) that tries to help people with ALS by giving them back their voice. They record the voices of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although researchers are working on such technology, it can be further improved by adding emotion to a person's own sound. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972; Breazeal, 2001; Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came up with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program it is not possible to include anatomical features. Nevertheless it is important to find the best possible way for ALS patients to express themselves verbally, even if it is based only on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of also adding acoustic features of emotion? This question leads to the research question of this research:<br />
<br />
What is the effect on human perception of adding acoustic features of emotion to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, a sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011; Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application. It can be used when no anatomical features of emotion are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. When sadness occurs, a sentence is spoken more slowly than when happiness occurs, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rate for happy and angry does not differ much, this value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007), so the Acapela Box was used to change this voice shape.<br />
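As an illustration, the emotion-dependent settings described above could be collected in a small lookup table. This is a hypothetical Python sketch: the parameter names, and the neutral baseline rate of 3.0 syllables per second, are assumptions for illustration (the text only gives 1.91 for sad and 4.15 for happy/angry), and Acapela Box itself is configured through its web interface, not through code like this.

```python
# Hypothetical emotion profiles; the syllable rates come from the studies
# cited in the text, the rest of the structure is invented for illustration.
NEUTRAL_RATE = 3.0  # assumed neutral baseline, in syllables per second

EMOTION_PROFILES = {
    # speech rate (Williams & Stevens, 1972; Breazeal, 2001),
    # pitch direction relative to neutral (Liscombe, 2007)
    "sad":   {"syllables_per_sec": 1.91, "pitch": "lower"},
    "happy": {"syllables_per_sec": 4.15, "pitch": "higher"},
}

def rate_scaling(emotion: str, neutral_rate: float = NEUTRAL_RATE) -> float:
    """Speech-rate multiplier relative to the assumed neutral voice."""
    return EMOTION_PROFILES[emotion]["syllables_per_sec"] / neutral_rate
```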
<br />
Audacity is a free, open-source, cross-platform program to record and edit audio (Audacity, 2014). Audacity has many functions with which an audio fragment can be given an emotional tone. With the function ‘amplify’ a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives the voice a variety of different pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect ‘change pitch’ and filling in the percentage by which the pitch should change. Another feature is that a sad voice has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be made longer. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than with a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
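The two Audacity operations that matter most here, a gain in decibels and a change of duration, can be imitated on raw samples. The following is a rough numpy sketch, not Audacity's actual processing: note that the naive stretch below resamples the signal, which also shifts the pitch, whereas Audacity's ‘change tempo’ preserves pitch (that requires a phase vocoder or similar).

```python
import numpy as np

def amplify_db(samples: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale a waveform by a gain given in decibels."""
    return samples * 10 ** (gain_db / 20)

def stretch_duration(samples: np.ndarray, factor: float) -> np.ndarray:
    """Naive time stretch by linear resampling (also shifts pitch,
    unlike Audacity's pitch-preserving 'change tempo')."""
    n_out = int(len(samples) * factor)
    idx = np.linspace(0, len(samples) - 1, n_out)
    return np.interp(idx, np.arange(len(samples)), samples)

# A 1-second 440 Hz test tone at 16 kHz
tone = np.sin(2 * np.pi * 440 * np.linspace(0, 1, 16000, endpoint=False))
happy = amplify_db(tone, 7.0)            # 7 dB louder, the happy/sad gap
slow_word = stretch_duration(tone, 1.2)  # a long word spoken 20% slower
```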
<br />
(For the research ten different sentences were used and adjusted. To see more specific adjustments on these sentences, click here: [[Adjustments sentences]])<br />
<br />
With Google Forms a new survey can be created, also collaboratively. It is a tool used to collect information. Audio fragments (via video) can be inserted and every desired question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program for working with spreadsheets. The spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata was used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test would be suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not expected. Therefore there is only one direction in which an effect could occur, and a two-tailed t-test would not be needed. <br />
However, it is likely that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small. For a two-tailed t-test 394 participants per condition would be needed, and for a one-tailed t-test 310 participants per condition. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
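These power analyses can be reproduced with standard tools. Below is a sketch using Python's statsmodels, assuming the calculations were for a two-sample t-test with Cohen's d = 0.5 for a moderate effect and d = 0.2 for a small effect (these assumptions reproduce the participant numbers reported above).

```python
import math
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

def n_per_group(effect_size: float, alternative: str) -> int:
    """Smallest whole number of participants per condition for
    power 0.8 at significance level 0.05."""
    n = analysis.solve_power(effect_size=effect_size, alpha=0.05,
                             power=0.8, alternative=alternative)
    return math.ceil(n)

medium_two = n_per_group(0.5, "two-sided")  # 64 per condition
medium_one = n_per_group(0.5, "larger")     # 51 per condition
small_two = n_per_group(0.2, "two-sided")   # 394 per condition
small_one = n_per_group(0.2, "larger")      # 310 per condition
```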
<br />
For the experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group heard only one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All of these variables consisted of a Likert scale varying from 1 to 5, and all are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
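Composing a scale score from its questionnaire items amounts to combining the listed questions, for instance by averaging. A hypothetical Python sketch (the analysis was actually done in Stata; the participant data here are invented, only the question numbers come from the text):

```python
import numpy as np

# Question numbers per scale, as listed in the design above
SCALES = {
    "likeability":    [27, 30, 35, 38, 40],
    "animacy":        [28, 31, 32, 34],
    "persuasiveness": [29, 33, 36, 37],
}

def scale_score(responses: dict, scale: str) -> float:
    """Mean of the 1..5 Likert answers belonging to one scale."""
    return float(np.mean([responses[q] for q in SCALES[scale]]))

# One invented participant: question number -> Likert answer (1..5)
participant = {q: 4 for questions in SCALES.values() for q in questions}
participant[28] = 2  # one low animacy answer
```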
<br />
Because the survey is about water consumption and persuasiveness, two variables were constructed that could influence the way participants responded to the comments of Will. These covariates measure how easy a participant is to convince and how much a participant cares about the environment. Both covariates are composed of several questions in the questionnaire: how easily one is convinced is composed of questions 5, 8, 13 and 18, and how much one cares about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
(The questions of the questionnaire can be found at [[Research design]].)<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one in which some demographic information was asked. Besides some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent that participants directly knew about our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was heard which either gave positive or negative feedback. The third part contained questions about the experience of the voice heard. At last, we added some final questions to give us an indication about general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friends lists of our Facebook accounts. Facebook was used to avoid including elderly people who are unable to change their showering habits because they live in a care home. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed from the data set because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants who were totally distracted while filling in the questionnaire were removed. After this, 94 participants were left.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception were used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; the values were 0.85, 0.89, and 0.85, respectively. The concepts will be discussed in the given order. <br />
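Cronbach's alpha for such scales is computed from the matrix of item scores. A hypothetical Python sketch (the response matrix below is randomly generated, not the study's data; the study's own alphas were 0.85, 0.89 and 0.85):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Randomly generated 5-item Likert responses for 94 participants
rng = np.random.default_rng(2)
base = rng.integers(1, 6, size=(94, 1))    # a participant's overall tendency
noise = rng.integers(-1, 2, size=(94, 5))  # per-item variation
scores = np.clip(base + noise, 1, 5).astype(float)
alpha = cronbach_alpha(scores)
```

Perfectly parallel items give alpha = 1; uncorrelated items push it toward 0.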
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Lickert scale, that is encoded from -2 (totally disagree) to +2 (totally agree). Zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who either only heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or four negative audiofragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric kwallist test was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the God speed questionnaire. This scale is from 1 to 5.<br />
<br />
As a follow up test, the kwallis was executed another two times, but now one time with only participants that at least heard four positive audiofragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and the second time with only participants that at least heard four negative audiofragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audiofragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between the two conditions. This test was chosen because the ANOVA assumptions were not met (equal variance across both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4. The y-axis of figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard influences perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, persuasion did not show a significant difference between the conditions, as indicated by the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect remained non-significant and the effect size hardly increased.<br />
<br />
Likeability also showed a non-significant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable the voice is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained in several ways. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence also matters: more information about the water consumption should be available and constructive arguments should be given. Leaving out this information was done deliberately in order to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlling for positive fragments, another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice human-like and therefore probably did not find it very likeable. However, when mostly positive fragments were heard, the robot was apparently perceived as more likeable; the positive fragments probably sounded more human-like than the negative ones did. <br />
<br />
For animacy a significant effect was found between the two conditions: in the condition with emotion the perceived animacy was higher than in the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). This finding was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes for both these groups were larger than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice; they might have perceived it as more lively because there was no other voice to compare it with. However, these findings were not significant, and their reliability is questionable because the two groups consisted of only 24 and 10 persons, respectively. <br />
<br />
Besides the previously mentioned issues, more improvements can be made. These limitations influence all three measured concepts. To begin with, given the time available to complete this research, concessions had to be made regarding the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably detect a medium effect, so more participants would have been needed to enhance reliability. A second problem might be the kind of speech program that was used. It is possible that the difference between the neutral voice and the emotionally loaded voice was somewhat hard to hear, which again decreases the chance of finding an effect. The reason lies in the way the program creates its voices. As stated in the method, the spoken text that comes from Acapela is not computer-generated; it is recorded by a human speaker, and this forms the basis of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with and without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to do so for this research, because manufacturers of robots and speech programs can already quite easily generate a better-sounding voice than a robotic one. The practical applicability of this research would therefore have decreased if the neutral condition had been a robotic voice.<br />
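The kind of prior power analysis mentioned here can be approximated without special software. A hedged sketch using the common normal-approximation formula for a two-group comparison, n per group ≈ 2·((z<sub>1−α/2</sub> + z<sub>1−β</sub>)/d)², with Cohen's d = 0.5 for a medium effect (the α = 0.05 and power = 0.80 choices are conventional illustrations, not the study's actual settings):

```python
# Approximate per-group sample size needed to detect a standardized effect
# size d with a two-sided two-sample test, via the normal approximation.
# alpha = 0.05 and power = 0.80 are conventional illustrative choices.
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)

print(n_per_group(0.5))  # medium effect (d = 0.5) needs roughly 63 per group
```

This illustrates why a sample of this size can only reliably detect a medium or larger effect: detecting small effects (d around 0.2) would require several hundred participants per condition.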
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotion are combined, e.g. acoustics and the meaning of the sentence, they reinforce each other. This reinforcing effect of combining different features was also found in previous studies. <br />
<br />
Now let's look back at the problem posed at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but in many cases this becomes impossible as the illness progresses. Increasing their freedom in any way would be a gift to them, and the freedom to express yourself is the main focus of this research. Researchers are working on technology that allows people with ALS to use their own voice with speech technology. This research looks further than using the sound of the voice of people with ALS. Using someone's own voice would improve the level of animacy considerably, but as this research shows, adding acoustic features of emotion to a voice produced by a TTS will also enhance the level of animacy. The perceived likeability when using certain emotions will also increase when the acoustic features are implemented. Using these findings in speech technology will allow people with ALS to build a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction: after analysing the results, there was no significant difference between using acoustic features of emotion or not, nor a large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; it is not necessarily relevant to the issue of ALS. This does not mean that the findings are irrelevant for other applications. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the Nao robot and the Amigo robot. The research used different parameters to change the voices according to a certain emotion. The Nao robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013); these are both acoustic features of emotion that were also used to create the voices in this research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of Nao's text-to-speech program says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that Nao can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all of the parameters used can be changed on Nao. The purpose of Nao can vary a lot: if you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. 
If future improvements of Nao's speech technology expand the changeable parameters, Nao can become even more suitable for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, Google's TTS, and Ubuntu's eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research and that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these settings can only be applied to a whole fragment; it is not possible to change them within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, however, is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are predefined, while other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the Philips TTS make Amigo more flexible than Nao for communicating emotional sentences. <br />
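For reference, eSpeak exposes whole-utterance parameters as command-line flags (`-s` speed in words per minute, `-p` pitch 0–99, `-a` amplitude 0–200). The sketch below builds such a command; the happy/sad presets and their numeric values are our own illustration, not settings taken from the study or from eSpeak's documentation:

```python
# Hypothetical preset table mapping an emotion label onto eSpeak's
# command-line flags (-s speed, -p pitch, -a amplitude). The numeric
# values are illustrative guesses, not calibrated settings.
import shlex

PRESETS = {
    "neutral": {"-s": 160, "-p": 50, "-a": 100},
    "happy":   {"-s": 180, "-p": 70, "-a": 130},   # faster, higher, louder
    "sad":     {"-s": 130, "-p": 35, "-a": 80},    # slower, lower, softer
}

def espeak_command(text, emotion="neutral"):
    """Return the argv list for one whole-utterance eSpeak call."""
    args = ["espeak"]
    for flag, value in PRESETS[emotion].items():
        args += [flag, str(value)]
    return args + [text]

print(shlex.join(espeak_command("Your shower used 60 litres of water.", "sad")))
```

Note that, as the paragraph above explains, the flags apply to the whole utterance: per-word or mid-sentence changes are not available this way.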
<br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds, such as 'Wow' or sobbing, to a voice recording. This could enhance the present findings, so this research can also serve as a basis for further work. Besides that, more research is needed on the characteristics of acoustic features of emotions. Although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate Statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Logboek&diff=15783Logboek2014-10-17T10:04:57Z<p>S126005: </p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Date<br />
! width="10%" | Time<br />
! width="60%" | Description<br />
! width="25%" | Who?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Introductory lecture explaining the project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Choosing a topic during the lecture<br />
| Meike + Suzanne + Iris + Floor <br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 4 hours<br />
| Searching for sources on techniques for writing without using your hands (i.e. with eyes or brain)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 3 hours<br />
| Searching for sources on emotions in voice (and applied in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on the current state of technology for making robots speak.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on voice cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Sept<br />
| 6 hours<br />
| Sharing sources and discussing what we want and what we will do<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 30 min<br />
| Meeting with Raymond Cuijpers to ask questions about possibilities<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 4 hours<br />
| Finalizing the idea and preparing Monday's presentation<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 1 hour<br />
| Read an article on emotions and worked on the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 15 min<br />
| Called Suzanne for an update about Friday<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 20 min<br />
| Improved the slide explanations and the research question<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 75 min<br />
| Searched for sources on characteristic aspects of emotions in speech<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 45 min<br />
| Read an article on emotions and improved the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Prepared the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1.5 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Lecture with presentations<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Meeting: feedback, general action items, agreeing on fixed meeting days, and drafting a small plan.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Cleaned up the Week 1 wiki page and presented it clearly<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Searched for sources on text-to-speech systems and information about Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Looked up Nao's speech-related functions<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 10 min<br />
| Added to the feedback on the presentation of Monday 8 September<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 2 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 4 hours<br />
| Searched for, selected, and studied TTS systems, among others in Matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 1.5 hours<br />
| Searched for articles on emotions in speech<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 30 min<br />
| Read the article on emotions that Suzanne found<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 90 min<br />
| Worked in Matlab with pitch and speech rate.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Scanned the articles found yesterday for relevance.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Processed sources and updated the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Found and read the article "Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency" and put an explanation on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 6 hours<br />
| Discussing progress, a different angle on the research<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 3 hours<br />
| Put a very detailed planning on the wiki ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Made the design of the planning for the presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 2 hours 15 min<br />
| Searched for sources on our new angle.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 15 min<br />
| Put the outcome of the Thursday 11 September meeting on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 30 min<br />
| Set milestones and deliverables and worked on the presentation<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 3 hours<br />
| Made the presentation and put it on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 1 hour<br />
| Started the overview of emotion characteristics + values<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 2 hours 30 min<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 Sept<br />
| 5 hours<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 1 hour 30 min<br />
| Going through last week's feedback and this week's planning.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 2 hours<br />
| Presentations<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 4 hours<br />
| Searched for a new TTS program that does not modify the voice, and adjusted Matlab functions<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 15 min<br />
| Put the feedback on the presentation of Monday 15 September in Week 2.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 5 hours 15 min<br />
| Investigated options for adjusting aspects of the robot voice<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 1 hour 30 min<br />
| Research into the best way to work out the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 3 hours<br />
| Searched for sources on persuasiveness and put them on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 4 hours<br />
| Working out the setup for the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 5 hours<br />
| Group meeting in which we came up with the survey idea and told each other what we did this week.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 1.5 hours<br />
| Looking at the final sentences and how to approach them<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 2 hours<br />
| Devising and working out the survey setup<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Sept<br />
| 8 hours<br />
| Recording and editing the robot voice. It is now finished: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 3 hours<br />
| Investigating options for online questionnaires.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 5 hours<br />
| Working out the final version of the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 1 hour 30 min<br />
| Discussing the questionnaires via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 2 hours<br />
| Discussing the questionnaire setup / discussing progress<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Feedback session<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1.5 hours<br />
| Putting audio fragments on YouTube<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1 hour<br />
| Performed the power analysis.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 Sept<br />
| 30 min<br />
| Discussing what to do about the number of participants<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Meeting with Raymond and debriefing.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Progress meeting.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 45 min<br />
| Updated the wiki.<br />
| Iris <br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 1 hour<br />
| Questionnaire meeting.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 3 hours 45 min<br />
| Putting the questionnaire into Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 6 hours<br />
| Created extra sentences. <br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Put the extra sentences on YouTube.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 3 hours<br />
| Drew up the Godspeed questionnaire for the end of the survey and continued working on the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 15 min<br />
| Checking the questionnaire so far and looking for improvements<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 5 hours<br />
| Made part of the questionnaire in Google Docs<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Giving feedback ([[Feedback questionnaire]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey on Google Docs. <br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 4 hours<br />
| Conducted a pilot study with my parents and evaluated it<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 1 hour<br />
| Discussion about the questionnaire<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 15 min<br />
| Reviewing the e-mail from [[Acapelabox]] and processing the information<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Feedback questionnaire <br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 3 hours<br />
| Discussing progress, giving personal feedback, and improving the questionnaire <br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Coach meeting<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Re-uploading the videos to YouTube because not everything went well last time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Made different versions of the questionnaires. <br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Checking whether all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Going through the questionnaires once more, adjusting questions, and removing cosmetic errors.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 45 min<br />
| Going through the questionnaires once more and removing small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 1 hour 30 min<br />
| Inviting people for the questionnaire.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people for the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people for the questionnaire.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 3 hours 30 min<br />
| Looking through the questionnaires and inviting people for the questionnaire<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sending reminders for the questionnaire and looking at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and devised the analysis. <br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour <br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour <br />
| Worked on the design (method)<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min <br />
| Sending people a reminder to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours 30 min<br />
| Writing 'Participants' and coding data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours <br />
| Reminded people to fill in the questionnaire and made an outline for the introduction<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 50 min<br />
| Finished the design (method) and updated the research setup 2.0 (Week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the questionnaire.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 30 min<br />
| Coding data<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and placed it on the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and sought the last participant<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Made the first part of the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting, coding data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour 30 min<br />
| Checking the data coding<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 30 min<br />
| Finished the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 15 min<br />
| Checking the coding<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 3 hours 45 min<br />
| Carried out all Tuesday-morning tasks. See [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Oct<br />
| 6 hours<br />
| Carried out all Wednesday-evening tasks and started on the Thursday-afternoon tasks. See [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion. <br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 2 hours<br />
| Finished the introduction and made the reference list/citations <br />
| Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 13 Oct<br />
| 4.5 hours<br />
| Meeting including coach, feedback, and new planning<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 15 Oct<br />
| 5 hours<br />
| Discussion about the conclusion/discussion and further writing.<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 6 hours 45 min <br />
| Finishing the discussion, checking and cleaning up the wiki, improving the graphs.<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 45 min <br />
| Processing the information from J. Lunenburg.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Oct<br />
| 45 min <br />
| Critically read through the discussion once more.<br />
| Iris<br />
|}<br />
<br />
<br />
Total number of hours: 531 hours and 30 minutes.</div>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that spread all over Facebook. The idea of the campaign was to raise awareness of the disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (Euan MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voices of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample groups to successfully recognize certain emotions, like sadness. Other studies came up with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) argue that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that combining multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts from sentences without acoustic emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions on top of that? This leads to the research question of this research:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge of prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has a broader application. It can be used when no anatomical features of emotions are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases, the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages with your text and their voices. The voice ‘Will’ was used because it is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. In the case of sadness, a sentence is spoken more slowly than in the case of happiness, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry speech do not differ much, this value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
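The speaking-rate figures cited above translate into utterance durations by simple arithmetic. As an illustration only (the study itself used Acapela Box, not code; all names below are hypothetical), a minimal Python sketch:

```python
# Speaking rates in syllables per second, as cited from Williams & Stevens
# (1972) and Breazeal (2001). Dict and function names are illustrative.
RATE_SYL_PER_S = {"sad": 1.91, "happy": 4.15, "angry": 4.15}

def duration_seconds(n_syllables, emotion):
    """Expected utterance duration at the cited speaking rate."""
    return n_syllables / RATE_SYL_PER_S[emotion]

# A 12-syllable sentence takes roughly 6.3 s when sad but 2.9 s when happy.
print(round(duration_seconds(12, "sad"), 2))    # -> 6.28
print(round(duration_seconds(12, "happy"), 2))  # -> 2.89
```

This makes concrete how much slower a sad rendering of the same sentence is: more than twice as long.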
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function ‘amplify’ you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average, according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘change pitch’ and filling in the desired percentage change. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
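The manipulations described above were done in Audacity itself; purely to illustrate the numbers involved, a minimal Python sketch (hypothetical helper names, a mono signal as a plain list of float samples) could look like this. Note that Audacity’s ‘change tempo’ is pitch-preserving, which a naive resample is not, so tempo changes are deliberately left out of the sketch.

```python
def apply_gain_db(samples, db):
    """Scale amplitude by a decibel gain; +7 dB is a factor of about 2.24,
    the average happy-vs-sad loudness gap from Bowles & Pauletto (2010)."""
    factor = 10 ** (db / 20.0)
    return [s * factor for s in samples]

def lengthen_pause(samples, start, end, stretch):
    """Stretch a silent region [start, end) by inserting extra zero samples,
    mimicking the longer inter-word breaks of a sad voice."""
    extra = int((end - start) * (stretch - 1.0))
    return samples[:end] + [0.0] * extra + samples[end:]

happy = apply_gain_db([0.1, -0.2, 0.05], 7.0)          # louder "happy" take
sad = lengthen_pause([0.3, 0.0, 0.0, 0.3], 1, 3, 2.0)  # doubled pause
```

The decibel conversion (gain factor = 10^(dB/20)) is the same arithmetic Audacity’s ‘amplify’ effect performs internally.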
<br />
(To see the more specific adjustments to the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a survey together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every kind of question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document (for the program Microsoft Office Excel) (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata was used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis, the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test would be suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which an effect could occur, and a two-tailed t-test would not be needed. <br />
However, it is presumable that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small. For a two-tailed t-test 394 participants per condition would be needed, and for a one-tailed t-test 310 participants per condition. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
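The sample sizes above can be reproduced (to within one participant) with the standard normal-approximation formula for a two-sample t-test, n = 2·((z₁₋α/tails + z_power)/d)² per group. The sketch below uses only the Python standard library; the slightly higher figures in the text (64, 51, 394) come from the exact t-distribution, which this approximation omits.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80, tails=2):
    """Normal-approximation sample size per group for a two-sample t-test,
    with d the standardized (Cohen's) effect size."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / tails)   # critical value of the test
    z_b = z.inv_cdf(power)               # quantile for the desired power
    return ceil(2 * ((z_a + z_b) / d) ** 2)

print(n_per_group(0.5))           # moderate effect, two-tailed -> 63 (text: 64)
print(n_per_group(0.5, tails=1))  # moderate effect, one-tailed -> 50 (text: 51)
print(n_per_group(0.2))           # small effect, two-tailed -> 393 (text: 394)
print(n_per_group(0.2, tails=1))  # small effect, one-tailed -> 310 (text: 310)
```

This makes visible why the small-effect scenario was out of reach: a small effect requires roughly six times as many participants per condition as a moderate one.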
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices. Each group only heard one of the conditions. These two conditions made up the independent variable, which therefore is a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5; all the dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were constructed that could influence the way participants responded to the comments of Will. These covariates are how easily you are convinced and how much you care about the environment. Both covariates are composed of several questions in the questionnaire. How easily you are convinced is composed of questions 5, 8, 13 and 18; how much you care about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
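Each dependent variable and covariate is thus an aggregate of its questionnaire items. A small Python sketch of scoring one participant (item numbers taken from the text; the answers are invented, and scoring as the mean of the 1-5 Likert items is an assumed, though common, convention):

```python
# Item numbers per scale, from the text above. Names are illustrative.
SCALES = {
    "likeability":       [27, 30, 35, 38, 40],
    "animacy":           [28, 31, 32, 34],
    "persuasiveness":    [29, 33, 36, 37],
    "easily_convinced":  [5, 8, 13, 18],
    "cares_environment": [4, 6, 10, 11, 12, 15, 17, 19],
}

def scale_score(answers, items):
    """Mean of the listed 1-5 Likert items for one participant."""
    return sum(answers[q] for q in items) / len(items)

participant = {q: 4 for q in list(range(1, 20)) + list(range(27, 41))}
print(scale_score(participant, SCALES["animacy"]))  # -> 4.0
```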
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one, in which some demographic information was asked. Besides this, some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. At last, we added some final questions to give us an indication about general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. Besides, one extra person was removed from the dataset, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they mastered the English language were removed. At last, participants were removed who were totally distracted while filling in the questionnaire. After this we were left with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha. These values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
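The alphas were computed in the statistics software; for reference, Cronbach's alpha follows the standard formula α = k/(k−1) · (1 − Σσᵢ²/σ_total²), which can be sketched with the Python standard library (the data below is invented, not the study's):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha. `items` is one list of scores per questionnaire
    item, aligned across participants; population variances are used."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # per-participant sums
    sum_item_var = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_var / pvariance(totals))

# Three perfectly parallel items give an alpha of exactly 1.0.
print(cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]))  # -> 1.0
```

Values like the reported 0.85-0.89 indicate that the items of each scale vary together strongly enough to be averaged into one score.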
<br />
To test whether the persuasiveness of the voice was perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree); zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (kwallis in Stata) was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect size was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire; this scale runs from 1 to 5.<br />
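For reference, the Kruskal-Wallis H statistic behind this test, together with the η² = χ²/(N−1) conversion used above, can be sketched in plain Python. This sketch omits the tie-correction factor that real implementations apply — a simplification worth flagging, since Likert data is heavily tied:

```python
def kruskal_wallis_h(groups):
    """H statistic over a list of groups, using average ranks for ties
    (without the tie-correction factor applied by statistics packages)."""
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:                       # assign average ranks to tied runs
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2
        i = j
    rank_sums = {}
    for (_, gi), r in zip(pooled, ranks):
        rank_sums[gi] = rank_sums.get(gi, 0.0) + r
    return 12 / (n * (n + 1)) * sum(
        rank_sums[gi] ** 2 / len(groups[gi]) for gi in rank_sums
    ) - 3 * (n + 1)

def eta_squared(chi2, n_total):
    """Effect size used in the text: eta^2 = chi^2 / (N - 1)."""
    return chi2 / (n_total - 1)

h = kruskal_wallis_h([[1, 3, 5, 7], [2, 4, 6, 8]])
print(h, eta_squared(h, 8))
```

Because H is approximately chi-squared distributed, plugging it into eta_squared gives the effect sizes reported in this section.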
<br />
As a follow-up test, the Kruskal-Wallis test was executed another two times: once with only participants that heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants that heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when the heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion, a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included that heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept. So the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained by several things. First, for persuasion it holds that emotion might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: more information should be available about the water consumption, and convincing arguments should be given. Leaving out information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlling for positive fragments, another problem may affect the results. A lot of participants commented that the voice sounded too fake or robotic. It was found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used human-like, and therefore probably did not find it very likeable. However, apparently when mostly positive fragments were heard, the robot was perceived as more likeable. So probably the positive fragments sounded more human-like than the negative ones did. <br />
<br />
For animacy, a significant effect was found between the two conditions: for the condition with emotion, the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA, obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size of animacy in general. An explanation for this could be that people who heard the same emotion several times became more accustomed to that voice. Therefore they might have perceived it as more lively, because they did not hear any other voice to compare it with. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of 24 and 10 persons, respectively. <br />
<br />
Besides the previously mentioned issues, more improvements can be made. These limitations influence all three measured concepts. To begin with, given the time to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. It is possible that the difference between the neutral voice and the emotionally loaded voice was somewhat hard to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices. As was stated in the method, the spoken text that comes from Acapela is not computer-generated; it is recorded by a human speaker, which is the basis for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with and without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research, because it is already quite easy for manufacturers of robots and speech programs to generate a better-sounding voice than a robotic voice. The practical applicability of this research would therefore have decreased if the neutral condition had been a robotic voice.<br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustics and the meaning of the sentence, they have a reinforcing effect. This reinforcing effect of combining different features was also found in previous studies. <br />
<br />
Now let's look back at the problem that was given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology (Euan MacDonald Centre, 2014), the implementation can be improved. By using someone’s own voice, the level of animacy improves a lot, but as this research shows, adding acoustic features of emotion to a voice produced by a TTS will enhance the level of animacy as well. Also, the perceived likeability when using certain emotions will be increased by implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology, as stated in the introduction. After analysing the results, there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader application of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other applications. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry. Examples of these kinds of robots are the NAO robot and the AMIGO robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but in the explanation of the text-to-speech program of NAO nothing is said about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences; however, these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot. If you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of NAO expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second mentioned social robot was AMIGO. AMIGO uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in the research which can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options are only possible for a whole fragment; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. However, the TTS developed by Philips is already quite advanced: it is possible to select a certain emotion, including sad and exciting (Philips, 2014). Besides that, the speech of AMIGO is generated in real time. This means that parts of the sentences are predefined, but other parts are filled in by AMIGO itself. AMIGO also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the TTS from Philips make AMIGO more flexible to use for communicating emotional sentences than NAO. <br />
<br />
Overall, the findings of this research are useful to increase the perceived animacy and likeability of robots. At this moment, the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, such as ‘Wow’ or sobbing sounds. This could enhance the findings of this research. Thus, for further research, this research can be used as a basis. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University EPS 625 – intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Comlumbia: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15781Samenvatting2014-10-17T10:01:22Z<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that spread all over Facebook. The idea of the campaign was to raise awareness of the disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions, like sadness. Other studies came up with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that combining multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences that are spoken without emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect when acoustic (physical) features of emotion are added? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotion to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has broader implementations. It can be used when no anatomical features of emotion are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, which leads to the desired change in behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from written text with a set of predefined voices. The voice of Will was used, because it is English (US) and has the functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; its parameters are implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. If sadness occurs, sentences are spoken more slowly than if happiness occurs, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry speech do not differ much, this value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007), so the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which an audio fragment can be given an emotional tone. With the function ‘amplify’ a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average, according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives the voice a variety of different pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect ‘change pitch’ and filling in the desired percentage change. Another feature of a sad voice is that it has longer breaks between two words compared with all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
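The adjustments above reduce to simple arithmetic: a level difference in decibels corresponds to a multiplicative amplitude factor of 10^(dB/20), and slowing a word by 20% means multiplying its duration by 1.2. A minimal Python sketch of these conversions (the numbers are the ones cited above; the function names are illustrative and not part of Audacity):

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def slowed_duration(duration_s: float, percent_slower: float) -> float:
    """Duration of a word after slowing it down by `percent_slower` percent."""
    return duration_s * (1 + percent_slower / 100)

# A happy fragment is on average 7 dB louder than a sad one
# (Bowles & Pauletto, 2010): roughly 2.24x the amplitude.
print(round(db_to_amplitude_ratio(7.0), 3))   # 2.239

# A long word of 0.50 s spoken sadly is pronounced ~20% slower.
print(round(slowed_duration(0.50, 20.0), 2))  # 0.6
```

This is why the ‘change tempo’ effect (a duration scale) and the ‘amplify’ effect (an amplitude scale) could be applied independently per word or per break.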
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms a new survey can be created, together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was suspected; a negative relation was not expected. Therefore there is only one direction in which an effect could occur, and a two-tailed t-test is not needed. <br />
However, the effect size is presumably small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
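The sample sizes above can be reproduced with a standard normal-approximation power calculation: n per group ≈ 2·(z₁₋α(/2) + z₁₋β)²/d², with d = 0.5 for a moderate and d = 0.2 for a small effect (Cohen's rules of thumb). A sketch using only the Python standard library; the normal approximation lands one or two participants below the exact t-test figures reported above:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.8,
                two_tailed: bool = True) -> int:
    """Normal-approximation sample size per group for a two-sample t-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

print(n_per_group(0.5))                    # 63  (exact t-test: 64)
print(n_per_group(0.5, two_tailed=False))  # 50  (exact t-test: 51)
print(n_per_group(0.2))                    # 393 (exact t-test: 394)
print(n_per_group(0.2, two_tailed=False))  # 310
```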
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables were measured on a Likert scale varying from 1 to 5, and all are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability the questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of the questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were constructed that could influence the way participants responded to the comments of Will. These covariates are how easily you are convinced and how much you care about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of the questions 5, 8, 13 and 18, and how much you care about the environment of the questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one in which some demographic information was asked. Besides some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent that participants directly knew about our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was heard which either gave positive or negative feedback. The third part contained questions about the experience of the voice heard. At last, we added some final questions to give us an indication about general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. Besides, one extra person was removed from the data set, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants were removed who were totally distracted while filling in the questionnaire. After this we were left with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, and their age was between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, and their age was between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education differed among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha. These values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
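Cronbach's alpha for a scale of k items is α = k/(k−1) · (1 − Σσᵢ²/σₜ²), where σᵢ² are the individual item variances and σₜ² the variance of the summed scores. A small Python illustration of that formula (toy data only; the alphas above were computed from the actual questionnaire responses):

```python
from statistics import pvariance

def cronbach_alpha(items: list[list[float]]) -> float:
    """items[i] holds all respondents' scores on item i."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum
    item_var = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Three perfectly consistent items give an alpha of 1.0.
print(round(cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]), 6))  # 1.0
```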
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
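For a one-way ANOVA, the reported effect size η² is the between-group sum of squares divided by the total sum of squares, η² = SS_between/SS_total. A minimal Python illustration (toy data, not the study's ratings):

```python
from statistics import mean

def anova_eta_squared(groups: list[list[float]]) -> float:
    """Effect size eta^2 = SS_between / SS_total for a one-way ANOVA."""
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_total = sum((x - grand) ** 2 for g in groups for x in g)
    return ss_between / ss_total

print(round(anova_eta_squared([[1, 2, 3], [2, 3, 4]]), 3))  # 0.273
```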
<br />
After this, a second test was done in which participants were only included if they said that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (Stata's kwallis command) was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
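The Kruskal-Wallis statistic behind Stata's kwallis command ranks all N observations together and compares rank sums per group: H = 12/(N(N+1)) · Σ Rᵢ²/nᵢ − 3(N+1); the effect size above then follows as η² = H/(N−1), treating H as the χ² statistic. A sketch in Python (average ranks for ties, no further tie correction; toy data):

```python
def kruskal_wallis_h(groups: list[list[float]]) -> float:
    """Kruskal-Wallis H statistic over a list of groups of observations."""
    pooled = sorted(x for g in groups for x in g)
    # Average 1-based rank for each value, so ties share a mid-rank.
    rank = {v: (1 + pooled.index(v) + (len(pooled) - pooled[::-1].index(v))) / 2
            for v in pooled}
    n = len(pooled)
    rank_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6]])
print(round(h, 3))            # 3.857
print(round(h / (6 - 1), 3))  # eta^2 = H/(N-1): 0.771
```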
<br />
As a follow-up, the Kruskal-Wallis test was executed another two times: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the ANOVA assumptions were not met (rejection of equal variance for both groups). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4. The y-axis of figure 4 represents the scale encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. In addition, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there must be more information available about the water consumption, and convincing arguments should be given. Leaving out such information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlling for positive fragments, another problem may affect the results. A lot of participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used human-like and therefore probably did not find it very likeable. However, apparently when mostly positive fragments were heard, the robot was perceived as more likeable, so probably the positive fragments sounded more human-like than the negative ones did. <br />
<br />
For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions has a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size of animacy in general. An explanation could be that people who heard the same emotion several times were more accustomed to that voice; therefore they might have perceived it as more lively, because there was no other voice with which they could compare it. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of 24 and 10 persons, respectively. <br />
<br />
Besides the previously mentioned issues, more improvements can be made. These limitations influence all three measured concepts. To begin with, given the time to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. It is possible that the difference between the neutral voice and the emotionally loaded voice was somewhat hard to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as stated in the method, the spoken text that comes from Acapela is not computer-generated, but recorded by a human speaker. This is the basis for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with and without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research, because it is already quite easy for manufacturers of robots and speech programs to generate a better-sounding voice than a robotic one. The practical applicability of this research would therefore have decreased if the neutral condition had been a robotic voice.<br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustics and the meaning of the sentence, this has a reinforcing effect. This reinforcing effect of combining different features was also found in previous studies. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice in speech technology, the implementation can be improved. By using someone’s own voice, the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also increase when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results, there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013); these are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of NAO says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot; if you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of NAO expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in the research that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options only apply to a whole fragment; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, however, is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are predefined, while other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the Philips TTS make Amigo more flexible than NAO for communicating emotional sentences. <br />
<br />
Overall, the findings of this research are useful to increase the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds, such as ‘Wow’ or sobbing, to a voice recording. This could enhance the findings of this research, so this study can also serve as a basis for further research. Besides that, more research needs to be done on the acoustic characteristics of emotions. Although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM.<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate statistics: The Kruskal-Wallis test. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Comlumbia: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15780Samenvatting2014-10-17T09:48:18Z<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that went all over Facebook. The idea of the campaign was to raise awareness of this disease so that money could be collected for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voices of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) argue that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program it is not possible to include anatomical features. Nevertheless it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this research:<br />
<br />
What is the effect on the perception of humans by adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge of prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has a broader implementation. It can be used when no anatomical features of emotions are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that want to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different modes ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry voices do not differ much, this value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function ‘amplify’ you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, a happy voice shows a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘change pitch’ and filling in the percentage by which you want to change it. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster in happy speech, and in sad speech longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
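As an illustration of the amplitude adjustment described above: a gain of +7 dB corresponds to multiplying the raw samples by 10^(7/20) ≈ 2.24. The sketch below is illustrative only; the function and the per-emotion table are our own summary of the cited parameters, not part of Audacity.<br />

```python
def amplify(samples, gain_db):
    """Scale raw audio samples by a gain expressed in decibels."""
    factor = 10 ** (gain_db / 20)  # convert dB to a linear amplitude factor
    return [s * factor for s in samples]

# Illustrative per-emotion settings distilled from the cited literature
# (Bowles & Pauletto, 2010; Williams & Stevens, 1972):
EMOTION_PARAMS = {
    "happy": {"gain_db": 7.0, "speed_factor": 1.0},  # ~7 dB louder than sad
    "sad":   {"gain_db": 0.0, "speed_factor": 0.8},  # long words ~20% slower
}
```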
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. This spreadsheet can be imported in the program Stata and from there on it is considered as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test is not needed. <br />
However, it is likely that the effect size is small rather than moderate. Another power analysis was executed to see how many participants would be needed for a small effect size: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed test. Because the resources and time to recruit so many participants were not available, the experiment was executed with 51 participants per condition. <br />
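The power analysis above can be reproduced with a normal-approximation formula for the per-group sample size of a two-sample t-test. This is a sketch only; the exact t-distribution-based numbers reported in the text differ from it by about one participant.<br />

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8, two_tailed=True):
    """Normal-approximation sample size per condition for a two-sample t-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Moderate effect (d = 0.5): ~63 two-tailed, ~50 one-tailed
# Small effect (d = 0.2): ~393 two-tailed, ~310 one-tailed
```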
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables were measured on a Likert scale from 1 to 5 and are treated as interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were created that could influence the way participants responded to the comments of Will. These covariates are how easily a participant is convinced and how much a participant cares about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of questions 5, 8, 13 and 18; how much you care about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
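The composition of the scales above can be summarised as follows. This is a sketch: the dictionary keys and the simple averaging are our own illustration of how item scores might be combined into scale scores.<br />

```python
# Question numbers that make up each scale (taken from the text above).
SCALES = {
    "likeability":       [27, 30, 35, 38, 40],
    "animacy":           [28, 31, 32, 34],
    "persuasiveness":    [29, 33, 36, 37],
    "easily_convinced":  [5, 8, 13, 18],                   # covariate
    "cares_environment": [4, 6, 10, 11, 12, 15, 17, 19],   # covariate
}

def scale_score(responses, scale):
    """Average the 1-5 Likert answers of the questions belonging to one scale."""
    items = [responses[q] for q in SCALES[scale]]
    return sum(items) / len(items)
```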
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one, in which some demographic information was asked. Besides that, some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides that, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed from the dataset because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants were removed who were totally distracted while filling in the questionnaire. After this, 94 participants were left.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha; these values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
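Cronbach's alpha for such a scale can be computed from the item variances and the variance of the participants' total scores. The function below is a stdlib-only sketch of that formula, not the exact routine the statistics software used.<br />

```python
def cronbach_alpha(items):
    """items: one list of scores per question, all over the same participants."""
    k = len(items)        # number of questions in the scale
    n = len(items[0])     # number of participants

    def variance(xs):     # sample variance
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    # total score per participant across all items of the scale
    totals = [sum(item[p] for item in items) for p in range(n)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))
```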
<br />
To test whether the persuasiveness of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
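The reported <math>\eta ^2</math> for the ANOVA is the ratio of between-group to total sum of squares. The sketch below illustrates that computation on made-up data, not on the experiment's data.<br />

```python
def anova_eta_squared(*groups):
    """eta^2 = SS_between / SS_total for a one-way ANOVA."""
    everything = [x for g in groups for x in g]
    grand_mean = sum(everything) / len(everything)
    ss_total = sum((x - grand_mean) ** 2 for x in everything)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total
```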
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the emotion condition rated the voice as more likeable than the participants in the neutral condition did. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal–Wallis test (Stata's kwallis command) was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect size was calculated with the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS 625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
<br />
As a follow-up test, the Kruskal–Wallis test was executed another two times: once with only participants that heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants that heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
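The Kruskal–Wallis statistic and the effect-size formula above can be sketched in a few lines. This version uses midranks for ties but omits the tie correction that a full implementation such as Stata's kwallis applies.<br />

```python
from itertools import chain

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic using midranks for ties (no tie correction)."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2  # average rank of the tied block
        i = j
    return 12 / (n * (n + 1)) * sum(
        sum(rank[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

def eta_squared(h, n):
    """Effect size eta^2 = chi^2 / (N - 1), as in the NAU EPS 625 handout."""
    return h / (n - 1)
```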
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal–Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4, whose y-axis represents the scale from 1 to 5 retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. In addition, the effect size was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included that heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there should be more information available about water consumption, and convincing arguments should be given. Leaving out information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlling for positive fragments, another problem may have affected the results. A lot of participants commented that the voice sounded too fake or robotic. It was found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice human-like and therefore probably did not find it very likeable. However, when mostly positive fragments were heard, the robot was apparently perceived as more likeable. So the positive fragments probably sounded more human-like than the negative ones did. <br />
<br />
For animacy a significant effect was found between the two conditions: perceived animacy was higher in the condition with emotion than in the condition without emotion. The effect size indicated that the difference between the two conditions has a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes for both these groups were larger than the effect size of animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice; they might have perceived it as more lively because they heard no other voice with which to compare it. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of 24 and 10 persons, respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire; these limitations influence all three concepts. To begin with, given the time to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with and without acoustic features of emotions was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as was stated in the method, actors recorded entire sentences, and this is the base for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with and without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previous research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustic and grammatical, this has a reinforcing effect. <br />
<br />
Now let us look back at the problem that was given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone’s own voice, the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results there was no significant difference between using acoustic features of emotions or not, nor was there a large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of NAO says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The purpose of NAO can vary a lot; if you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If improvements of the speech technology of NAO in the future expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second mentioned social robot was Amigo. Amigo uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in the research that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options are only possible for a whole fragment; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. However, the TTS developed by Philips is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are already predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the TTS from Philips make Amigo more flexible to use for communicating emotional sentences than NAO. <br />
<br />
Overall, the findings of this research are useful to increase the perceived animacy and likeability of robots. At this moment, the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds, such as ‘Wow’ or sobbing, to a voice recording. This could enhance the findings of this research, so this research can serve as a basis for further work. Besides that, more research needs to be done on the acoustic characteristics of emotions: although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion that is based only on acoustic features, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15779Samenvatting2014-10-17T09:47:46Z<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that spread all over Facebook. The idea of the campaign was to raise awareness of the disease so that money could be collected for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
A team of researchers (Euan MacDonald Centre, 2014) tries to help people with ALS by giving them back their voice. They record the voices of ALS patients prior to muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) show that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic and grammatical features is effective. Thus both studies conclude that combining multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is based only on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of also adding acoustic features of emotion? This leads to the research question:<br />
<br />
What is the effect on human perception of adding acoustic features of emotion to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice has a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its applications are broader. It can be used when no anatomical features of emotion are available, but a certain emotion still needs to be communicated. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some of these devices use only voice recordings instead of an avatar. In such cases the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, which leads to the desired change in behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from written text with a set of predefined voices. The voice of Will was used, because it is English (US) and has the different modes ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are then applied to the written text. In this process attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions: a sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry speech do not differ much, the value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007), so the Acapela Box was used to change this voice shape.<br />
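The reported speech rates give a rough duration estimate per emotion. A minimal sketch (the rate values come from the studies cited above; the function and dictionary names are ours, and this code was not part of the study itself):

```python
# Approximate speech rates (syllables per second) per emotion,
# taken from Williams & Stevens (1972) and Breazeal (2001).
SPEECH_RATE = {
    "sad": 1.91,    # slow, with long breaks between words
    "happy": 4.15,  # the rate reported for anger, reused for happiness
}

def estimated_duration(n_syllables, emotion):
    """Rough duration (in seconds) of an utterance spoken with a given emotion."""
    return n_syllables / SPEECH_RATE[emotion]

# A 12-syllable sentence takes roughly twice as long when spoken sadly:
print(round(estimated_duration(12, "sad"), 1))    # 6.3
print(round(estimated_duration(12, "happy"), 1))  # 2.9
```

This illustrates why the speech-rate parameter in Acapela Box matters so much for conveying sadness.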
<br />
Audacity is free, open source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which an audio fragment can be given an emotional tone. With the function ‘amplify’ a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average, according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives the voice a variety of different pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect ‘adjust pitch’ and filling in the desired percentage. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be increased. This function can also lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when happy, and when a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what these emotions require.<br />
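The two numeric adjustments above can be made concrete. A small sketch (function names are ours; the 7 dB and 20% figures are from Bowles & Pauletto, 2010):

```python
def db_to_gain(db):
    """Convert a decibel change to a linear amplitude factor (dB = 20 * log10(gain))."""
    return 10 ** (db / 20)

def slow_down(duration, percent):
    """New duration after slowing a word down by `percent`, as with Audacity's 'change tempo'."""
    return duration * (1 + percent / 100)

# The 7 dB difference makes the happy fragments about 2.24 times
# louder than the sad ones in linear amplitude:
print(round(db_to_gain(7), 2))  # 2.24
# A long word of 0.5 s pronounced 20% slower (sad) lasts 0.6 s:
print(slow_down(0.5, 20))       # 0.6
```

Note that ‘change tempo’ alters duration without touching pitch, so the two manipulations are independent.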
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms a survey can be created together with others at the same time; it is a tool used to collect information. Audio fragments (via video) can be inserted and every kind of question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata was used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for this experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which an effect could occur, and a two-tailed t-test was not needed. <br />
However, it is likely that the effect size is small rather than moderate. Another power analysis was executed to see how many participants would be needed for a small effect size: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
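These sample sizes can be reproduced, to within one participant, with the standard normal-approximation formula for a two-sample t-test. A sketch (the function name and the use of Python are ours; the study itself used other tools, and published tables that correct for the t-distribution's degrees of freedom give values one higher in some cells):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8, tails=2):
    """Participants per condition for a two-sample t-test (normal approximation):
    n = 2 * ((z_alpha + z_beta) / d)^2, rounded up."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

print(n_per_group(0.5, tails=2))  # 63  (df-corrected tables give 64)
print(n_per_group(0.5, tails=1))  # 50  (reported: 51)
print(n_per_group(0.2, tails=2))  # 393 (reported: 394)
print(n_per_group(0.2, tails=1))  # 310 (reported: 310)
```

The approximation makes clear why a small effect (d = 0.2) demands roughly six times as many participants as a moderate one (d = 0.5).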
<br />
For the experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All of these were measured on a Likert scale varying from 1 to 5, so all dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire: questions 27, 30, 35, 38 and 40 for likeability; questions 28, 31, 32 and 34 for animacy; and questions 29, 33, 36 and 37 for persuasiveness.<br />
<br />
Because the survey is about water consumption and persuadability, two covariates were included that could influence the way participants responded to the comments of Will: how easily a participant is convinced, and how much a participant cares about the environment. Both covariates are composed of several questions in the questionnaire: questions 5, 8, 13 and 18 for how easily someone is convinced, and questions 4, 6, 10, 11, 12, 15, 17 and 19 for how much someone cares about the environment. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts.<br />
The first part was a general one in which some demographic information was asked, together with questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing the research goal. The second part consisted of a simulation in which everyone filled in questions about their showering habits. After each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, some questions were added to give an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits because they live in a care home, from filling in the questionnaire. In addition, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One more participant was removed because he commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. This left 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception were used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; these values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
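Cronbach's alpha can be computed directly from the item variances and the variance of the summed scores. A minimal sketch (the function name and toy data are ours; this is not the study's data):

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of per-item score lists over the
    same respondents: alpha = k/(k-1) * (1 - sum(item var) / var(totals))."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

# Two perfectly correlated items give the maximum alpha of 1.0:
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))  # 1.0
```

Values around 0.85-0.89, as reported above, indicate that the items within each scale measure the same underlying concept consistently.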
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test performed was to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric kwallis (Kruskal-Wallis) test was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect size was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
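The effect-size formula applies directly to the Kruskal-Wallis chi-square statistic. A short sketch (the function name is ours, and the chi-square value shown is back-calculated from the reported <math>\eta ^2</math> and N = 94 for illustration):

```python
def eta_squared(chi2, n):
    """Effect size for a Kruskal-Wallis test: eta^2 = chi^2 / (N - 1)."""
    return chi2 / (n - 1)

# With N = 94 participants, a chi-square of about 1.86 corresponds
# to the reported eta^2 of 0.02 for likeability:
print(round(eta_squared(1.86, 94), 2))  # 0.02
```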
<br />
As a follow-up, the kwallis test was executed another two times: once with only participants that heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants that heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a kwallis test was done to test whether there is a difference in animacy between both conditions. A kwallis test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. The effect size was also taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included that heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence also matters: more information about water consumption and convincing arguments should be given. Leaving out this information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, when not controlling for positive fragments, another problem may affect the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice human-like and therefore probably did not find it very likeable. However, when mostly positive fragments were heard, the robot was apparently perceived as more likeable, so the positive fragments probably sounded more human-like than the negative ones. <br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: perceived animacy was higher for the condition with emotion than for the condition without emotion. The effect size indicated that the difference between the two conditions has a medium effect (following the guidelines for a one-way ANOVA, obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size for animacy in general. An explanation for this could be that people who heard the same emotion several times became more accustomed to that voice. They might have perceived it as more lively because there was no other voice to compare it with. However, these findings were not significant, and their reliability is questionable because the two groups consisted of only 24 and 10 persons respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire; these limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotion and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as stated in the method, actors recorded entire sentences, and this forms the base of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: combining multiple characteristics of emotions, e.g. acoustic and grammatical, has a reinforcing effect. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone’s own voice the level of animacy is improved, and as this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased by implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology, as stated in the introduction: after analysing the results there was no significant difference between using acoustic features of emotion or not, nor a large effect. However, the concept of persuasiveness was taken into account for a broader application of the findings of this research, and this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant elsewhere. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013); these are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of NAO says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot: if it is known for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of NAO expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second mentioned social robot was Amigo. Amigo uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in this research that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options apply only to a fragment as a whole; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, however, is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the TTS from Philips make Amigo more flexible than NAO for communicating emotional sentences. <br />
<br />
Overall, the findings of this research are useful to increase the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, for example ‘Wow’ or sobbing sounds. This could enhance the findings of this research, so this research can also serve as the basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based only on acoustic features, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM.<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate Statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. New York: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Logboek&diff=15730Logboek2014-10-16T19:33:16Z<p>S126005: </p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Date<br />
! width="10%" | Time<br />
! width="60%" | Description<br />
! width="25%" | Who?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Introductory lecture explaining the project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Coming up with a topic during the lecture<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 4 hours<br />
| Searching for sources on techniques for writing without your hands (i.e. with the eyes or the brain)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 3 hours<br />
| Searching for sources on emotions in the voice (and their application in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on how far technology for making robots speak has currently been developed.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on voice cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Sept<br />
| 6 hours<br />
| Sharing sources and discussing what we want and what we are going to do<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 30 min<br />
| Meeting with Raymond Cuijpers to ask questions about the possibilities<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 4 hours<br />
| Settling on the final idea and preparing the presentation for Monday<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 1 hour<br />
| Read an article on emotions and worked on the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 15 min<br />
| Called Suzanne for an update about Friday<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 20 min<br />
| Improved the explanation on the slides and refined the research question<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 75 min<br />
| Searched for sources on characteristic aspects of emotions in speech<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 45 min<br />
| Read an article on emotions and improved the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Prepared the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1.5 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Lecture with presentations<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Meeting: feedback, general action items, agreeing on fixed meeting days and drawing up a short plan.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Cleaned up the week 1 wiki page and organised it clearly<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Searched for sources on text-to-speech systems and information about Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Looked up Nao's speech-related functions<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 10 min<br />
| Added to the feedback on the presentation of Monday 8 September<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 2 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 4 hours<br />
| Searched for, selected and explored TTS systems, among others in Matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 1.5 hours<br />
| Searched for articles on emotions in speech<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 30 min<br />
| Read the article on emotions that Suzanne found<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 90 min<br />
| Worked in Matlab with pitch and speech rate.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Scanned the articles found yesterday for relevance.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Processed sources and updated the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Found and read the article Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency and put an explanation on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 6 hours<br />
| Discussing progress, taking a different angle on the research<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 3 hours<br />
| Put a very detailed planning on the wiki ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Made the design of the planning for the presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 2 hours 15 min<br />
| Searched for sources on our new angle.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 15 min<br />
| Put the outcome of the Thursday 11 September meeting on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 30 min<br />
| Established the milestones and deliverables and worked on the presentation<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 3 hours<br />
| Made the presentation and put it on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 1 hour<br />
| Started on the overview of emotion characteristics + values<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 2 hours 30 min<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 Sept<br />
| 5 hours<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 1 hour 30 min<br />
| Going through last week's feedback and this week's planning.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 2 hours<br />
| Presentations<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 4 hours<br />
| Searched for a new TTS program that applies no modifications to the voice, and adjusted the Matlab functions<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 15 min<br />
| Put the feedback on the Monday 15 September presentation in week 2.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 5 hours 15 min<br />
| Investigated options for adjusting aspects of the robot voice<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 1 hour 30 min<br />
| Research into the best way of working out the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 3 hours<br />
| Searched for sources on persuasiveness and put them on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 4 hours<br />
| Working out the setup for the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 5 hours<br />
| Group meeting in which we came up with the idea for the survey and told each other what we had done this week.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 1.5 hours<br />
| Reviewing the final sentences and deciding how to approach this<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 2 hours<br />
| Devising and working out the setup of the survey<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Sept<br />
| 8 hours<br />
| Recording and editing the robot voice. It is now finished: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 3 hours<br />
| Investigating options for online questionnaires.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 5 hours<br />
| Working out the final version of the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 1 hour 30 min<br />
| Discussing the questionnaires via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 2 hours<br />
| Discussing the questionnaire setup / discussing progress<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Feedback session<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1.5 hours<br />
| Putting the audio fragments on YouTube<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1 hour<br />
| Performed a power analysis.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 Sept<br />
| 30 min<br />
| Discussing what to do about the number of participants<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Meeting with Raymond and debriefing.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Progress meeting.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 45 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 1 hour<br />
| Questionnaire meeting.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 3 hours 45 min<br />
| Putting the questionnaire in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 6 hours<br />
| Made extra sentences.<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Put the extra sentences on YouTube.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 3 hours<br />
| Drew up the Godspeed questionnaire for the end of the survey and continued working on the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 15 min<br />
| Checking the questionnaire so far and looking for improvements<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 5 hours<br />
| Made part of the questionnaire in Google Docs<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Giving [[Feedback questionnaire]]<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 4 hours<br />
| Conducted a pilot study with my parents and evaluated it<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 1 hour<br />
| Discussing the questionnaire<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 15 min<br />
| Reviewing the email from [[Acapelabox]] and processing the information<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Feedback questionnaire<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 3 hours<br />
| Discussing progress, giving personal feedback and improving the questionnaire<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Coach meeting<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Re-uploading the videos to YouTube because not everything had gone right the previous time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Made the different versions of the questionnaires.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Checking that all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Going through the questionnaires once more, adjusting questions and removing cosmetic flaws.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 45 min<br />
| Going through the questionnaires once more and removing small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 1 hour 30 min<br />
| Inviting people to the questionnaire.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people to the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people to the questionnaire.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 3 hours 30 min<br />
| Looking through the questionnaires and inviting people to the questionnaire<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sending reminders for the questionnaire and looking at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and thought out the analysis.<br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour<br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour<br />
| Worked on the design (method)<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min<br />
| Sending people a reminder to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours 30 min<br />
| Writing 'Participants' and coding data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours<br />
| Reminded people to fill in the questionnaire and made a draft for the introduction<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 50 min<br />
| Finished the design (method) and updated the setup of study 2.0 (week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the questionnaire.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 30 min<br />
| Coding data<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and placed it on the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and recruited the last participant<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Made the first part of the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting, coding data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour 30 min<br />
| Checking the data coding<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 30 min<br />
| Finished the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 15 min<br />
| Checking the coding<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 3 hours 45 min<br />
| Carried out all of Tuesday morning's tasks. See [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Oct<br />
| 6 hours<br />
| Carried out all of Wednesday evening's tasks and started on Thursday afternoon's tasks. See [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 2 hours<br />
| Finished the introduction and made the reference list/citations<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Oct<br />
| 4.5 hours<br />
| Meeting including coach, feedback and new planning<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Oct<br />
| 5 hours<br />
| Discussion about the conclusion/discussion and further writing.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 6 hours 45 min<br />
| Finishing the discussion, checking and cleaning up the wiki, improving the graphs.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Oct<br />
| 45 min<br />
| Processing the information from J. Lunenburg.<br />
| Iris<br />
|}<br />
<br />
<br />
Total number of hours: 530 hours and 45 minutes.</div>
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 okt<br />
| 2 uur <br />
| Mensen herinnerd aan het invullen van de enquete en een opzet gemaakt voor de inleiding<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 okt<br />
| 1 uur en 50 min<br />
| Design (methode) afgemaakt en opzet onderzoek 2.0 (week 4) bijgewerkt.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 okt<br />
| 2 uur en 15 min<br />
| Verder gewerkt aan de inleiding en nieuwe mensen proberen te benaderen voor het invullen van de enquete.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 okt<br />
| 1 uur en 30 min<br />
| Data coderen<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 okt<br />
| 3 uur<br />
| Eerste versie van de inleiding afgemaakt en op de wiki geplaatst<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 okt<br />
| 45 min<br />
| Methode doorgelezen en laatste participant gezocht<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 okt<br />
| 3 uur<br />
| Methode (materials) eerste deel gemaakt<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 okt<br />
| 6 uur<br />
| Coach meeting, data coderen.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 okt<br />
| 1 uur en 30 min.<br />
| Data codering controleren<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 okt<br />
| 1 uur en 30 min.<br />
| Methode (materials) afgemaakt<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 okt<br />
| 15 min.<br />
| Methode (design) aangepast en bron toegevoegd aan inleiding.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 okt<br />
| 1 uur en 15 min.<br />
| Codering checken<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 okt<br />
| 3 uur en 45 min.<br />
| Alle taken van dinsdagochtend uitgevoerd. Zie [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 okt<br />
| 6 uur<br />
| Alle taken van woensdagavond uitgevoerd en een begin gemaakt aan taken van donderdagmiddag. Zie [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 9 okt<br />
| 6 uur<br />
| Resultaten geschreven en begin aan discussie. <br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 9 okt<br />
| 2 uur<br />
| Introductie afgemaakt en de bronnenlijst/verwijzingen gemaakt <br />
| Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 13 okt<br />
| 4,5 uur<br />
| Meeting inclusief coach, feedback en nieuwe planning<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 15 okt<br />
| 5 uur<br />
| Discussie over conclusie/discussie en verder schrijven.<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 15 okt<br />
| 6 uur en 45 minuten <br />
| Discussie afschrijven, wiki controleren en opschonen, grafieken verbeteren.<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 15 okt<br />
| 45 minuten <br />
| Informatie van J. Lunenburg verwerken.<br />
| Iris<br />
|}<br />
<br />
<br />
Total number of hours: 530 hours and 45 minutes.</div>
S126005
https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15728
Samenvatting
2014-10-16T19:31:49Z
<p>S126005: /* References */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that spread all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may not be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
A team of researchers (Euan MacDonald Centre, 2014) tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized from acoustic features alone. Busso et al. (2004) show that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic and grammatical features is effective. Thus both studies conclude that combining multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences without acoustic emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this research:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vosse, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has a broader implementation. It can be used when no anatomical features of emotions are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change in behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used because this one is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. If sadness occurs, a sentence will be spoken more slowly than if happiness occurs, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rate of happy and angry does not differ much, it was chosen to use this value of 4.15 syllables per second for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has an average high pitch and a sad voice has an average low pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
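As a rough illustration of how these syllable rates translate into sentence durations, a minimal sketch; note that the ‘neutral’ rate below is an assumed midpoint, not a value from the cited studies:<br />

```python
# Rough target duration of a spoken sentence per emotion, using the syllable
# rates cited above (Williams & Stevens, 1972; Breazeal, 2001).
# NOTE: the 'neutral' rate is an assumed midpoint, not a measured value.
RATES_SYLL_PER_SEC = {"sad": 1.91, "neutral": 3.00, "happy": 4.15}

def target_duration(n_syllables: int, emotion: str) -> float:
    """Return the expected duration in seconds of a sentence."""
    return n_syllables / RATES_SYLL_PER_SEC[emotion]

# A 12-syllable sentence: ~6.3 s when sad, ~2.9 s when happy.
print(round(target_duration(12, "sad"), 1), round(target_duration(12, "happy"), 1))
```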
<br />
Audacity is free, open source, cross-platform software to record and edit audio (Audacity, 2014). In Audacity you can use many functions with which you can give the audio fragment an emotional tone. With the function ‘amplify’ you may choose a new peak amplitude, which is used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto. The loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a voice while being sad decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘adjust pitch’ and filling in the percentage by which you want to change the pitch. Another feature is that a sad voice has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, this break can be lengthened. This function can also help you to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what these emotions require.<br />
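The gain and pause manipulations described above can be sketched directly on a waveform. A minimal NumPy illustration (the function names are ours, not Audacity’s, and the waveform is a toy signal):<br />

```python
import numpy as np

def amplify_db(samples: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale a waveform by a gain in decibels, e.g. +7 dB to make a
    'happy' fragment louder than a 'sad' one (Bowles & Pauletto, 2010)."""
    return samples * 10 ** (gain_db / 20.0)

def lengthen_pause(samples: np.ndarray, start: int, end: int, factor: int) -> np.ndarray:
    """Crudely lengthen a region [start:end) by repeating it `factor` times,
    approximating a longer between-word break for a 'sad' voice."""
    pause = samples[start:end]
    return np.concatenate([samples[:end], np.tile(pause, factor - 1), samples[end:]])

wave = np.ones(8)                       # toy waveform
louder = amplify_db(wave, 7.0)          # ~2.24x linear gain
longer = lengthen_pause(wave, 2, 4, 3)  # a 2-sample pause becomes 6 samples
```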
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test is not needed. <br />
However, the effect size is presumably small rather than moderate. Another power analysis was executed to see how many participants would be needed for a small effect size: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect that many participants were not available, the experiment was executed with 51 participants per condition. <br />
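These sample sizes can be reproduced with a standard power calculation. A sketch using statsmodels, assuming Cohen’s d = 0.5 for ‘moderate’ and d = 0.2 for ‘small’:<br />

```python
from math import ceil
from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()

for label, d in [("moderate", 0.5), ("small", 0.2)]:
    # Required participants per condition at power 0.8, alpha 0.05
    n_two = ceil(power_calc.solve_power(effect_size=d, alpha=0.05, power=0.8,
                                        alternative="two-sided"))
    n_one = ceil(power_calc.solve_power(effect_size=d, alpha=0.05, power=0.8,
                                        alternative="larger"))
    print(f"{label}: two-tailed {n_two}/condition, one-tailed {n_one}/condition")
# moderate: two-tailed 64/condition, one-tailed 51/condition
# small: two-tailed 394/condition, one-tailed 310/condition
```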
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices. Each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables were measured on a Likert scale varying from 1 to 5, and all are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability the questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of the questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were made which could influence the way participants responded to the comments of Will. These covariates are how easily you are convinced and how much you care about the environment. Both covariates are composed of several questions in the questionnaire. How easily you are convinced is composed of the questions 5, 8, 13 and 18; how much you care about the environment is composed of the questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
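Combining questionnaire items into scale scores like this can be sketched with pandas; the column names and responses below are hypothetical, not the study’s data:<br />

```python
import pandas as pd

# Hypothetical responses: columns q27, q30, ... hold 1-5 Likert answers
df = pd.DataFrame({
    "q27": [4, 2], "q30": [5, 3], "q35": [4, 2], "q38": [3, 3], "q40": [4, 2],
})

# Scale score = mean of the items that make up the construct
likeability_items = ["q27", "q30", "q35", "q38", "q40"]
df["likeability"] = df[likeability_items].mean(axis=1)
print(df["likeability"].tolist())  # [4.0, 2.4]
```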
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one, in which some demographic information was asked, along with some questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was heard which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. At last, we added some final questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friend lists of our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. One extra person was removed from the data set, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. At last, participants who were totally distracted while filling in the questionnaire were removed. After this we were left with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education differed among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, and the second is for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha. These values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
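Cronbach's alpha for a scale can be computed from the item and total-score variances. A small sketch with NumPy (the example data are illustrative, not the study's):<br />

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_participants, k_items) matrix of responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Perfectly consistent items yield alpha = 1.0
print(cronbach_alpha(np.array([[1, 1], [2, 2], [3, 3]])))  # 1.0
```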
<br />
To test whether the persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree); zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was also tested whether persuasion differed between the conditions. <br />
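A two-group ANOVA with an <math>\eta ^2</math> effect size can be sketched with SciPy; the ratings below are made up for illustration and are not the study's data:<br />

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical persuasion scores per condition (not the study's data)
neutral = np.array([-0.5, 0.0, 0.5, -0.25, 0.25])
emotion = np.array([0.0, 0.75, 0.25, 0.5, -0.25])

f_stat, p_value = f_oneway(neutral, emotion)

# eta-squared = SS_between / SS_total
scores = np.concatenate([neutral, emotion])
ss_total = ((scores - scores.mean()) ** 2).sum()
ss_between = sum(len(g) * (g.mean() - scores.mean()) ** 2 for g in (neutral, emotion))
eta_squared = ss_between / ss_total  # 1/9 for this toy data
```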
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (Stata's kwallis) was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect size was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
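The same test and effect-size formula can be sketched with SciPy (illustrative ratings, not the study's data):<br />

```python
import numpy as np
from scipy.stats import kruskal

# Hypothetical Godspeed-style likeability ratings (1-5) per condition
neutral_ratings = np.array([2.8, 3.0, 2.6, 3.2, 2.4])
emotion_ratings = np.array([3.4, 3.0, 3.8, 3.2, 3.6])

h_stat, p_value = kruskal(neutral_ratings, emotion_ratings)

# Effect size: eta^2 = chi^2 / (N - 1), with H as the chi-squared statistic
n_total = len(neutral_ratings) + len(emotion_ratings)
eta_squared = h_stat / (n_total - 1)
```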
<br />
As a follow-up test, the kwallis was executed another two times: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a kwallis test was done to test whether there is a difference in animacy between both conditions. A kwallis test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people to change their behavior. Some participants gave the feedback that the content of the sentence also matters: there must be more information available about the water consumption, and convincing arguments should be given. Leaving out information was done deliberately to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may have affected the results. A lot of participants commented that the voice sounded too fake or robotic. It was found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used here human-like and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions has a medium effect (following the guidelines for a one-way ANOVA, obtained from the Cognition and Brain Sciences Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size for animacy in general. An explanation for this could be that people who heard the same emotion several times became more accustomed to that voice. Therefore they might have perceived it as more lively because they did not hear any other voice to compare it with. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of 24 and 10 persons respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin, given the time to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotions and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as was stated in the method, actors recorded entire sentences, and these recordings are the base for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research. <br />
<br />
The outcome of this research is in accordance with the previously done research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustic and grammatical, this has a reinforcing effect. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice in speech technology, the implementation can be improved. By using someone’s own sound, the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased by implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry. Examples of such robots are the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but in the explanation of the text-to-speech program of NAO nothing is said about speed changes. Pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences; however, these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot. If you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of NAO expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second mentioned social robot was Amigo. Amigo uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in the research which can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options are only possible for whole fragments; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. However, the TTS developed by Philips is already quite advanced. It is possible to select a certain emotion, including sad and exciting (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are already predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the TTS from Philips make Amigo more flexible to use for communicating emotional sentences than NAO. <br />
<br />
Overall the findings of this research are useful to increase perceived animacy and likeability of robots. At this moment applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording. Examples are ‘Wow’ or sobbing sounds. This can enhance the findings of this research. Thus for further research this research can also be used as the basis. Besides that more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, it can also be improved. An important remark needs to be made that it will always be difficult for people to recognise an emotion that is only based on acoustic features. The reason for this is that the combination of multiple features lead to a correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arziona University EPS 625 – intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Comlumbia: Columbia University.<br />
<br />
Lunenburg, J. J. M. Contact person about TTS of Amigo. Contacted on: 16 October 2014.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15727Samenvatting2014-10-16T19:29:20Z<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscular diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that went all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (Euan MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but that express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect when acoustic features of emotions are added? This question leads to the research question of this research:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge of prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it also has broader applications. It can be used when no anatomical features of emotions are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that want to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. Sad sentences are spoken more slowly than happy ones, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, it was chosen to use this value of 4.15 syllables per second for happiness as well (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
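As a back-of-the-envelope illustration of these rates, the expected duration of an utterance per emotion can be estimated. The syllable rates are the cited values from Williams & Stevens (1972) and Breazeal (2001); the helper itself is our own sketch, not part of the study:

```python
# Cited syllable rates (syllables per second): sad speech vs. happy/angry speech.
SYLLABLE_RATES = {"sad": 1.91, "happy": 4.15, "angry": 4.15}

def estimated_duration(syllables, emotion):
    """Estimate utterance duration in seconds for a given emotion."""
    return syllables / SYLLABLE_RATES[emotion]
```

For a 20-syllable sentence this predicts roughly 10.5 s of sad speech versus roughly 4.8 s of happy speech, which is why the sad fragments had to be slowed down so markedly.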
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function ‘amplify’ you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average, according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘adjust pitch’ and filling in the percentage by which you want to change the pitch. Another feature is that a sad voice has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
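The numeric adjustments described above are easy to sketch: a 7 dB amplitude difference corresponds to a linear gain factor of 10^(7/20) ≈ 2.24, and a word made "20% slower" is simply 1.2 times as long. The helper functions below are illustrative only and are not taken from Audacity:

```python
def db_to_gain(db):
    """Convert a decibel difference to a linear amplitude factor."""
    return 10 ** (db / 20)

def slowed_duration(duration_s, percent_slower):
    """Lengthen a word's duration by the given percentage (tempo change, pitch preserved)."""
    return duration_s * (1 + percent_slower / 100)

# The 7 dB happy/sad difference is roughly a 2.24x amplitude factor:
# db_to_gain(7) ~= 2.24
```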
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. This spreadsheet can be imported in the program Stata and from there on it is considered as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed in a two-tailed t-test and 51 participants per condition in a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test is not needed. <br />
However, it is likely that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small. For a two-tailed t-test 394 participants per condition would be needed and for a one-tailed t-test 310 participants per condition. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
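These per-group sample sizes can be approximated with the standard normal-approximation formula n ≈ 2·((z₁₋α + z₁₋β)/d)². The exact t-test calculation used by power-analysis tools adds roughly one participant per group, which is why it reports 64 and 51 rather than the 63 and 50 of this approximation. A stdlib-only sketch (the function is our own, not the tool used in the study):

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8, two_tailed=True):
    """Approximate per-group n for a two-sample t-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - (alpha / 2 if two_tailed else alpha))
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# Medium effect (d = 0.5): 63 two-tailed, 50 one-tailed (exact t-test: 64 and 51).
# Small effect (d = 0.2): 393 two-tailed, 310 one-tailed.
```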
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices. Each group only heard one of the conditions. These two conditions made up the independent variable, which therefore is a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5, so all the dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy. The dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
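The subscale construction described above amounts to averaging the Likert responses of each variable's items. A minimal sketch; the item numbers are those listed above, but the data structure and helper are hypothetical:

```python
# Questionnaire item numbers per dependent variable, as listed above.
SUBSCALES = {
    "likeability": [27, 30, 35, 38, 40],
    "animacy": [28, 31, 32, 34],
    "persuasiveness": [29, 33, 36, 37],
}

def subscale_score(responses, items):
    """Average the 1-5 Likert responses of the items making up one subscale.

    `responses` maps question number -> Likert response (1..5).
    """
    return sum(responses[i] for i in items) / len(items)
```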
<br />
Because the survey is about water consumption and persuadability, two variables were created which could influence the way participants responded to the comments of Will. These covariates are how easily you are convinced and how much you care about the environment. Both covariates are composed of several questions in the questionnaire. How easily you are convinced is composed of questions 5, 8, 13 and 18. How much you care about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one, in which some demographic information was asked. Besides, some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some questions to give us an indication about general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. Besides, one extra person was removed from the dataset, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants who were totally distracted while filling in the questionnaire were removed. After this we were left with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, and their age was between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, and their age was between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, and the second is for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha. These values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
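Cronbach's alpha for a k-item scale is α = k/(k−1)·(1 − Σ var(itemᵢ)/var(total)). A minimal implementation, shown on synthetic data rather than the study's own responses:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item-score columns (one list per question)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # sum score per respondent
    item_var = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Two perfectly correlated items give alpha = 1.0.
```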
<br />
To test whether the persuasiveness of the voice was perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree); zero means neutral. <br />
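For a one-way ANOVA, the effect size <math>\eta ^2</math> is the between-group sum of squares divided by the total sum of squares. A small sketch with our own helper and illustrative data:

```python
def eta_squared(groups):
    """Effect size eta^2 = SS_between / SS_total for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per condition.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total
```

Identical group means give <math>\eta ^2</math> = 0 (as found here for persuasion, which was essentially zero); groups with no within-group spread give <math>\eta ^2</math> = 1.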
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition did. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (kwallis in Stata) was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
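The statistic behind this test can be reproduced by hand: the Kruskal-Wallis H (which is χ²-distributed) is computed from pooled ranks, and the effect size then follows the formula above, <math> \eta ^2=\frac{\chi ^2}{N-1}</math>. A sketch without tie correction, on illustrative data rather than the study's:

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction; assumes all values distinct)."""
    pooled = sorted(x for g in groups for x in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank 1..N over the pooled sample
    n = len(pooled)
    rank_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

def eta_squared_from_h(h, n):
    """Effect size used above: eta^2 = chi^2 / (N - 1)."""
    return h / (n - 1)
```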
<br />
As a follow-up, the Kruskal-Wallis test was executed another two times: once with only participants that heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants that heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when the heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. A Kruskal-Wallis test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p= 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant, and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained by several things. First, for persuasion, it may be that emotion alone is not enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there should be more information available about the water consumption, and constructive arguments should be given. Leaving out information was done deliberately to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may have affected the results. A lot of participants commented that the voice sounded too fake or robotic. It was found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used human-like and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions has a medium effect (following the guidelines for a one-way ANOVA, obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size of animacy in general. An explanation for this could be that people who heard the same emotion several times were more accustomed to that voice. Therefore they might have perceived it as more lively, because they did not hear any other voice they could compare it with. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of 24 and 10 persons respectively. <br />
<br />
Other issues that could be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin with, given the time to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants that took part was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with and without acoustic features of emotions was not easy to hear. This again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as was stated in the method, actors recorded entire sentences, and this is the base for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previous research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustic and grammatical, they have a reinforcing effect. <br />
<br />
Now let's look back at the problem that was given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone’s own voice the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research, and this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry. Examples of this kind of robot are the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of Nao says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that Nao can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for Nao. The purpose of Nao can vary a lot. If you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. 
If future improvements of the speech technology of Nao expand the changeable parameters, Nao can become even more appropriate for conveying emotionally loaded sentences. <br />
The second mentioned social robot was Amigo. Amigo uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in the research which can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options can only be applied to a fragment as a whole; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. However, the TTS developed by Philips is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time. This means that parts of the sentences are predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the TTS from Philips make Amigo more flexible than Nao for communicating emotional sentences. <br />
<br />
Overall, the findings of this research are useful to increase the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, for example ‘Wow’ or sobbing sounds. This could enhance the findings of this research, so this research could serve as a basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion that is based on acoustic features alone, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arziona University EPS 625 – intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Comlumbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15726Samenvatting2014-10-16T19:28:58Z<p>S126005: /* References */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by the constitution and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that went all over Facebook. The idea of that campaign was to raise awareness of the disease so that money would be collected for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways; for example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired, which creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972; Breazeal, 2001; Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) found that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions, and Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Both studies thus conclude that the combination of multiple characteristics of emotion leads to the correct recognition of a specific emotion. When implementing these findings in a speech program it is not possible to include anatomical features. Nevertheless it is important to find the best possible way for ALS patients to express themselves verbally, even though it is based only on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences that are spoken without acoustic emotion but express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large is the effect of adding acoustic features of emotion on top of that? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotion to a sentence with grammatical emotional features?<br />
<br />
Based on prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice has a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011; Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has a broader implementation. It can be used when no anatomical features of emotion are available but it is still necessary to communicate a certain emotion. Persuasive technology, for example, includes devices that try to convince people to behave in a certain way, and some of these devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, leading to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from written text with one of its voices. The voice of Will was used, because it is English (US) and has the settings ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are implemented in the written text, paying attention to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). Acapela Box also gives the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009), which is very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happiness and anger do not differ much, the value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
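The reported syllable rates can be turned into a rough duration target for a sentence before it is generated. A minimal sketch (the helper name and the rate table are our own illustration, not part of any TTS API):<br />

```python
# Syllables per second reported for emotional speech
# (Williams & Stevens, 1972; Breazeal, 2001).
SYLLABLE_RATES = {"sad": 1.91, "happy": 4.15, "angry": 4.15}

def target_duration(n_syllables, emotion):
    """Approximate spoken duration (in seconds) of a sentence."""
    return n_syllables / SYLLABLE_RATES[emotion]
```

For a 12-syllable sentence this gives roughly 6.3 seconds when sad versus 2.9 seconds when happy, which illustrates how large the speech-rate difference between these emotions is.<br />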
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which an audio fragment can be given an emotional tone. With the function ‘amplify’ a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average, and the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half (Williams & Stevens, 1972); a sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, a happy voice shows a variety of pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect ‘change pitch’ and filling in the desired percentage. Another feature is that a sad voice has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’, the length of this break can be increased. This function can also lengthen or shorten individual words, which is useful because words with only one syllable are pronounced faster when happy, and when a person is sad, long words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With ‘change tempo’ the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
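The two manipulations above are easy to express numerically. A hedged sketch, assuming the 7 dB difference from Bowles and Pauletto and taking 15% as the midpoint of the reported 10-20% slowdown for short words (both helper names are illustrative, not Audacity functions):<br />

```python
def db_to_gain(db):
    """Linear amplitude factor corresponding to a decibel change,
    as applied by an 'amplify'-style effect."""
    return 10 ** (db / 20)

def sad_word_duration(neutral_seconds, long_word):
    """Stretch a word for a sad voice: long words ~20% slower,
    short words ~15% slower (assumed midpoint of the 10-20% range)."""
    factor = 1.20 if long_word else 1.15
    return neutral_seconds * factor
```

A 7 dB difference corresponds to an amplitude factor of about 2.24, so the happy fragments end up roughly twice as loud as the sad ones.<br />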
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms a survey can be created together with others at the same time; it is a tool used to collect information. Audio fragments (via video) can be inserted and any kind of question can be written. After enough data has been received from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document (for Microsoft Office Excel) (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. The power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 per condition for a one-tailed t-test. A one-tailed t-test is suitable for this experiment, because either no relation or a positive relation was expected; a negative relation was not. There is therefore only one direction in which an effect could occur, so a two-tailed t-test is not needed. <br />
However, it is likely that the effect size is small rather than moderate. Another power analysis was executed to see how many participants would be needed in that case: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect that many participants were not available, the experiment was executed with 51 participants per condition. <br />
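The reported sample sizes can be reproduced with the standard normal approximation n = 2((z_alpha + z_beta)/d)^2 per group. The sketch below is our own implementation of that approximation; it lands within one participant of the figures above (dedicated power software uses the noncentral t distribution and can differ by one):<br />

```python
import math

def z_quantile(p):
    """Standard-normal quantile, found by bisection on the erf-based CDF."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def n_per_condition(d, alpha=0.05, power=0.8, two_tailed=True):
    """Participants per condition for an independent-samples t-test
    (normal approximation to its power function)."""
    z_a = z_quantile(1 - alpha / 2) if two_tailed else z_quantile(1 - alpha)
    z_b = z_quantile(power)
    return math.ceil(2 * ((z_a + z_b) / d) ** 2)
```

With d = 0.5 this gives about 64 (two-tailed) and 51 (one-tailed) participants per condition; with d = 0.2 it gives about 394 and 310.<br />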
<br />
For this experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group heard only one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. Each of these was measured on a Likert scale from 1 to 5, so all dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire: questions 27, 30, 35, 38 and 40 for likeability; questions 28, 31, 32 and 34 for animacy; and questions 29, 33, 36 and 37 for persuasiveness.<br />
<br />
Because the survey is about water consumption and persuadability, two covariates were defined that could influence how participants responded to Will’s comments: how easily a participant is convinced, and how much a participant cares about the environment. Both are composed of several questions in the questionnaire: how easily you are convinced of questions 5, 8, 13 and 18, and how much you care about the environment of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts.<br />
The first part was a general one asking for demographic information, along with questions about two personal characteristics; some filler questions were included to prevent participants from immediately guessing the research goal. The second part consisted of a simulation in which everyone filled in questions about their showering habits; after each answer, an audio fragment was played that gave either positive or negative feedback. The third part contained questions about how the voice was experienced. Finally, some closing questions gave an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friend lists of our Facebook accounts. Facebook was used to avoid elderly people who are unable to change their showering habits because they live in a care home. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set three participants were removed because they submitted the questionnaire twice. One more person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. This left 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education differed among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha; the values were 0.85, 0.89, and 0.85 respectively. The concepts are discussed in that order. <br />
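Cronbach's alpha follows directly from the item variances and the variance of the total scores: alpha = k/(k-1) * (1 - sum of item variances / total-score variance). A minimal sketch using sample variances (the function name is our own; the actual values were computed in Stata):<br />

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha; responses[i][j] is respondent i's score on item j."""
    k = len(responses[0])
    item_vars = sum(variance([row[j] for row in responses]) for j in range(k))
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_vars / total_var)
```

Perfectly correlated items yield an alpha of 1; values around 0.85, as found here, indicate good internal consistency.<br />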
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), with zero meaning neutral. <br />
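The reported eta-squared values are the between-groups sum of squares divided by the total sum of squares. A self-contained sketch of the one-way ANOVA computation (illustrative only; the actual analysis was run in Stata):<br />

```python
from statistics import mean

def one_way_anova(*groups):
    """Return the F statistic and eta-squared for a one-way ANOVA."""
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = sum(len(g) for g in groups) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, ss_between / (ss_between + ss_within)
```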
<br />
After this, a second test was done in which only participants were included who said that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis (kwallis) test was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect size was calculated with the formula <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS 625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
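The Kruskal-Wallis statistic and the effect size <math> \eta ^2=\frac{\chi ^2}{N-1}</math> used here can be sketched as follows. This is our own simplified implementation, which assumes no tied observations (Stata's kwallis also reports a tie-corrected statistic):<br />

```python
def kruskal_wallis(*groups):
    """H statistic (no tie correction) and eta-squared H / (N - 1)."""
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    for rank, (_, gi) in enumerate(pooled, start=1):
        rank_sums[gi] += rank
    h = 12 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)) - 3 * (n + 1)
    return h, h / (n - 1)
```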
<br />
As a follow-up, the Kruskal-Wallis test was executed twice more: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when the heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p= 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. The effect size was also taken into consideration and was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence also matters: more information about the water consumption and convincing arguments should be given. Leaving out information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice human-like and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: perceived animacy was higher for the condition with emotion than for the condition without. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes for both of these groups were bigger than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice, and might have perceived it as more lively because there was no other voice to compare it with. However, these findings were not significant, and it is questionable how reliable they are because the two groups consisted of only 24 and 10 persons respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire; these limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotion and one without was not easy to hear, which again decreases the chance of finding an effect. The reason lies in the way the program creates its voices: as stated in the method, actors recorded entire sentences, and these form the base of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice, although in the end it was decided not to use one for this research. <br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotion are combined, e.g. acoustic and grammatical, they have a reinforcing effect. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone’s own voice, the level of animacy is improved, and as this research shows, adding acoustic features of emotion to a voice enhances the level of animacy even more. The likeability when using certain emotions is also increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction: after analysing the results there was no significant difference between using acoustic features of emotion or not, and there was also no large effect. However, the concept of persuasiveness was only taken into account for a broader implementation of the findings of this research and is not necessarily relevant to the issue of ALS. This does not mean that the findings are irrelevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with facial expressions, such as the Nao robot and the Amigo robot. This research used different parameters to change the voices according to a certain emotion. The Nao robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013), both acoustic features of emotion that were also used to create the voices in this research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of Nao says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that Nao can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all of the used parameters can be changed for Nao. The aim of Nao can vary a lot; if it is known for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of Nao expand the changeable parameters, Nao can become even more appropriate for communicating emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, Google's TTS, and Ubuntu's eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in this research that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options only apply to a whole fragment; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not offer a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, however, is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time, meaning that parts of the sentences are predefined while other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the Philips TTS make Amigo more flexible for communicating emotional sentences than Nao. <br />
<br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds, such as ‘Wow’ or sobbing, to a voice recording. This could enhance the findings of this research, so this research can serve as a basis for further work. Besides that, more research needs to be done on the acoustic features of emotions; although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, because it is the combination of multiple features that leads to correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Robotics. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arziona University EPS 625 – intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Comlumbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Philips (2014). Text-to-speech. Retrieved from Philips: http://www.extra.research.philips.com/text2speech/ttsdev/index.html. <br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15725Samenvatting2014-10-16T19:28:19Z<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that went all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voices of ALS patients prior to muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patients and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the acoustic features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample groups to successfully recognize certain emotions like sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized from acoustic features alone. Busso, et al. (2004) find that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without acoustic emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this research:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge of prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has broader applications. It can be used when no anatomical features of emotions are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. If sadness occurs, sentences are spoken more slowly than if happiness occurs, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, this value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has an average high pitch and a sad voice has an average low pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is a free, open source, cross-platform program to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function ‘amplify’ you may choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘change pitch’ and filling in the percentage by which you want to change it. Another feature is that a sad voice has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
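The 7 dB amplitude difference mentioned above can be applied directly to raw audio samples. The following sketch is a minimal illustration of the underlying arithmetic, not Audacity's internal code; the function name is our own:

```python
import numpy as np

def amplify(samples, gain_db):
    """Scale audio samples by a gain given in decibels.

    A gain of +7 dB multiplies the amplitude by 10**(7/20), roughly 2.24,
    matching the average happy-versus-sad loudness difference reported
    by Bowles & Pauletto (2010).
    """
    return samples * 10 ** (gain_db / 20)

# Example: make a 'sad' fragment 7 dB louder to reach a 'happy' level.
sad = np.array([0.10, -0.20, 0.15])
happy = amplify(sad, 7.0)
```

Applying the inverse gain (`amplify(happy, -7.0)`) recovers the original samples, since decibel gains add linearly.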
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey, together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every kind of question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a spreadsheet program. The spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed in a two-tailed t-test and 51 participants per condition in a one-tailed t-test. A one-tailed t-test would be suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there would be an effect, and a two-tailed t-test would not be needed. <br />
However, it is presumable that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small. For a two-tailed t-test 394 participants per condition would be needed, and for a one-tailed t-test 310 participants per condition. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
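This power analysis can be reproduced with, for example, statsmodels. The sketch below assumes Cohen's conventions (d = 0.5 for a moderate and d = 0.2 for a small effect); the text does not state which software performed the original analysis:

```python
from math import ceil
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()

def n_per_group(d, alternative):
    """Participants needed per condition at power 0.8, alpha 0.05."""
    return ceil(solver.solve_power(effect_size=d, alpha=0.05, power=0.8,
                                   alternative=alternative))

print(n_per_group(0.5, 'two-sided'))  # moderate effect, two-tailed: 64
print(n_per_group(0.5, 'larger'))     # moderate effect, one-tailed: 51
print(n_per_group(0.2, 'two-sided'))  # small effect, two-tailed: 394
print(n_per_group(0.2, 'larger'))     # small effect, one-tailed: 310
```

The computed sample sizes match the four values reported in the text, which suggests the original analysis used the same conventions.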
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5, and all are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were created which could influence the way participants responded to the comments of Will. These covariates are how easily one is convinced and how much one cares about the environment. Both covariates are composed of several questions in the questionnaire: how easily one is convinced is composed of questions 5, 8, 13 and 18, and how much one cares about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
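The composite variables described above (both the dependent variables and the covariates) amount to averaging the relevant questionnaire items per participant. A minimal pandas sketch, with hypothetical responses; the column names (`q27`, `q30`, ...) are our own and simply follow the question numbering in the text:

```python
import pandas as pd

# Hypothetical responses of two participants on a 1-5 Likert scale.
df = pd.DataFrame({
    'q27': [4, 2], 'q30': [5, 3], 'q35': [4, 2], 'q38': [3, 3], 'q40': [4, 5],
})

# Composite likeability score: mean of questions 27, 30, 35, 38 and 40.
likeability = df[['q27', 'q30', 'q35', 'q38', 'q40']].mean(axis=1)
print(likeability.tolist())  # [4.0, 3.0]
```

The same pattern applies to animacy, persuasiveness, and the two covariates, each with its own column subset.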
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part was a general one in which some demographic information was asked, as well as some questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. At last, we added some final questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. Besides, one extra person was removed from the dataset, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. At last, participants who were totally distracted while filling in the questionnaire were removed. After this we were left with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha. These values were 0.85, 0.89, and 0.85 respectively. The concepts will be discussed in the given order. <br />
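Cronbach's alpha for each scale can be computed from the item responses with the standard formula α = k/(k−1) · (1 − Σ item variances / variance of the total score). A small numpy sketch; this helper is our own illustration, not the code used in the original analysis:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; rows are respondents, columns are scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Perfectly correlated items yield an alpha of 1.0.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
```

Alpha values around 0.85-0.89, as reported above, indicate good internal consistency of the questionnaire scales.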
<br />
To test whether persuasion of the voice is perceived differently between the two conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
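A one-way ANOVA with its <math>\eta ^2</math> effect size can be reproduced with scipy, where <math>\eta ^2</math> is the between-groups sum of squares divided by the total sum of squares. The data below are hypothetical, for illustration only:

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical persuasion scores (Likert scale encoded -2..+2) per condition.
neutral = np.array([0, 1, -1, 0, 1, 0, -1, 1, 0, 0], dtype=float)
emotional = np.array([0, 1, 0, -1, 1, 1, 0, 1, -1, 1], dtype=float)

f_stat, p_value = f_oneway(neutral, emotional)

# Effect size: eta^2 = SS_between / SS_total.
grand = np.concatenate([neutral, emotional])
ss_total = ((grand - grand.mean()) ** 2).sum()
ss_between = sum(len(g) * (g.mean() - grand.mean()) ** 2
                 for g in (neutral, emotional))
eta_squared = ss_between / ss_total
```

With two groups the F statistic and <math>\eta ^2</math> are equivalent descriptions of the same between/within variance split.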
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (Stata's ''kwallis'') was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect size was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS 625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
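Stata's ''kwallis'' corresponds to scipy's `kruskal`, and the effect size follows the formula above, <math>\eta ^2=\frac{\chi ^2}{N-1}</math>, where the H statistic plays the role of <math>\chi ^2</math>. A sketch with hypothetical ratings:

```python
from scipy.stats import kruskal

# Hypothetical likeability ratings (1-5 Godspeed scale) per condition.
neutral = [2, 3, 2, 3, 2, 3, 3, 2, 2, 3]
emotional = [3, 4, 3, 4, 4, 3, 4, 4, 3, 4]

h_stat, p_value = kruskal(neutral, emotional)

n_total = len(neutral) + len(emotional)
eta_squared = h_stat / (n_total - 1)  # eta^2 = chi^2 / (N - 1)
```

The H statistic is asymptotically chi-squared distributed, which is why it can be substituted into the <math>\chi ^2</math>-based effect-size formula.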
<br />
As a follow-up, the Kruskal-Wallis test was executed another two times: once with only participants that heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants that heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between the two conditions. This test was chosen because the ANOVA assumptions were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference found can be seen in figure 4, whose y-axis represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small to medium effect on likeability was found between the conditions when only participants were included that heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there must be more information available about water consumption, and convincing arguments should be given. Leaving out information was done deliberately to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used human-like and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA, obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size of animacy in general. An explanation for this could be that people who heard the same emotion several times became more accustomed to that voice. Therefore they might have perceived it as more lively because they did not hear any other voice to compare it with. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of only 24 and 10 persons respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin with, given the time to complete this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotions and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this lies in the way the program created the voices: as stated in the method, actors recorded entire sentences, and this forms the basis of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research. <br />
<br />
The outcome of this research is in accordance with the previously done research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustic and grammatical, this has a reinforcing effect. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many respects as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express oneself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice in speech technology, the implementation can be improved. By using someone’s own voice the level of animacy is improved, and as this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013), both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but in the explanation of the text-to-speech program of NAO nothing is said about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences; however, these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot. If you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of NAO expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters used in this research that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options only apply to a fragment as a whole; it is not possible to change these settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, however, is already quite advanced: it is possible to select a certain emotion, including sad and excited (Philips, 2014). Besides that, the speech of Amigo is generated in real time: parts of the sentences are predefined, but other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the Philips TTS make Amigo more flexible for communicating emotional sentences than NAO. <br />
<br />
Overall, the findings of this research are useful to increase the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds, such as ‘Wow’ or sobbing sounds, to a voice recording. This can enhance the findings of this research, so for further research this research can be used as a basis. Besides that, more research needs to be done on the characteristics of acoustic features of emotions: although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based only on acoustic features, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM.<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate Statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. New York: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15724Samenvatting2014-10-16T19:28:08Z<p>S126005: /* Discussion and conclusion */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that went viral on Facebook. The idea of the campaign was to raise awareness of the disease and to collect money for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function because the responsible neurons can no longer send signals from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways; for example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (Euan MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voices of ALS patients before the muscles fail, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired, which creates a stronger emotional bond between patients and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the acoustic features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample groups to recognize certain emotions, such as sadness. Other studies offer an explanation for this problem: emotion cannot be recognized from acoustic features alone. Busso et al. (2004) state that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic and grammatical features is effective. Thus both studies conclude that combining multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but which express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large is the effect of adding acoustic features of emotions on top of this? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application. It can be used when no anatomical features of emotions are available, but a certain emotion still needs to be communicated. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way, and some of these devices use only voice recordings instead of an avatar. In these cases, the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from written text using its own voices. The voice of Will was used because it is English (US) and has the functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are then applied to the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also offers the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009), which is very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, the value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice a low average pitch (Liscombe, 2007), so the Acapela Box was used to adjust the voice shape accordingly.<br />
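The emotion-specific speech rates above translate directly into target sentence durations. A minimal sketch of that calculation (our own illustration; the ‘neutral’ rate is a hypothetical placeholder, not a value from the cited studies):<br />

```python
# Target sentence durations from the emotion-specific speech rates
# reported above (Williams & Stevens, 1972; Breazeal, 2001).
SYLLABLES_PER_SECOND = {
    "sad": 1.91,     # slow speech, long breaks between words
    "angry": 4.15,
    "happy": 4.15,   # approximated with the 'angry' rate (see text)
    "neutral": 3.0,  # hypothetical placeholder, not from the literature
}

def estimated_duration(n_syllables: int, emotion: str) -> float:
    """Target duration in seconds of a sentence spoken with an emotion."""
    return n_syllables / SYLLABLES_PER_SECOND[emotion]
```

A 12-syllable sentence then takes roughly 6.3 seconds when sad but only about 2.9 seconds when happy, which is the kind of difference the tempo adjustments below have to produce.<br />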
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). It offers many functions with which an audio fragment can be given an emotional tone. With the effect ‘Amplify’ a new peak amplitude can be chosen; this was used to make the happy audio fragments louder than the sad fragments, as the difference in amplitude between these two emotions is 7 dB on average, while the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of a sentence and remains rather constant in the second half (Williams & Stevens, 1972); a sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, a happy voice shows a wide variety of pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect ‘Change Pitch’ and entering the desired percentage. Another feature of sad speech is that the breaks between words are longer than in all other emotions (Bowles & Pauletto, 2010); by selecting a break between words and using the effect ‘Change Tempo’, the break can be lengthened. This effect can also lengthen or shorten the words of a sentence individually, which is useful because words with only one syllable are pronounced faster in happy speech, and in sad speech long words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With ‘Change Tempo’ the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
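The 7 dB amplitude difference corresponds to a linear gain factor of 10^(7/20) ≈ 2.24. A minimal NumPy sketch of what the ‘Amplify’ effect does to the raw samples (our own illustration, ignoring Audacity’s clipping protection):<br />

```python
import numpy as np

def amplify(samples: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale a waveform by gain_db decibels: +7 dB makes the 'happy'
    fragments louder than the 'sad' ones (Bowles & Pauletto, 2010)."""
    return samples * 10 ** (gain_db / 20.0)

# Example: a test tone amplified by the happy-vs-sad difference.
tone = np.sin(np.linspace(0.0, 2.0 * np.pi, 1000))
happy_tone = amplify(tone, 7.0)  # peak amplitude grows from ~1.0 to ~2.24
```

The same decibel-to-factor conversion underlies the peak-amplitude value entered in Audacity’s Amplify dialog.<br />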
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms a survey can be created together with others at the same time; it is a tool used to collect information. Audio fragments (via video) can be inserted and any type of question can be written. After receiving enough data from the required participants, the information is collected in a spreadsheet that can be exported as a .xlsx document (for Microsoft Office Excel) (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program for saving spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to estimate how many participants would be needed. The power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 per condition for a one-tailed t-test. A one-tailed t-test is suitable for this experiment, because either no relation or a positive relation was expected and a negative relation was not; there is therefore only one direction in which an effect could occur, so a two-tailed t-test is not needed. <br />
However, it is likely that the effect size is small instead of moderate. A second power analysis showed that for a small effect size, 394 participants per condition would be needed for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect that many participants were not available, the experiment was executed with 51 participants per condition. <br />
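These sample sizes can be reproduced approximately with the standard normal-approximation formula for a two-sample t-test. A sketch, assuming Cohen’s conventional effect sizes (d = 0.5 for moderate, d = 0.2 for small); exact t-based software arrives at the slightly larger numbers reported above:<br />

```python
import math
from statistics import NormalDist

def n_per_group(d: float, power: float = 0.8, alpha: float = 0.05,
                two_tailed: bool = True) -> int:
    """Participants per condition for a two-sample t-test
    (normal approximation; exact t-based tools give about one more)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2 if two_tailed else 1 - alpha)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)
```

For example, `n_per_group(0.5)` returns 63 and `n_per_group(0.2, two_tailed=False)` returns 310, matching the reported 64 and 310 up to the small-sample t correction.<br />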
<br />
For our experiment a between-subjects design was used. The two conditions were emotionally loaded voices and neutral voices; each group heard only one of the two. These conditions made up the independent variable, which is therefore categorical. The dependent variables were likeability, animacy and persuasiveness, each measured on a Likert scale from 1 to 5 and treated as an interval variable. The dependent variables were composed of several questions in the questionnaire: questions 27, 30, 35, 38 and 40 for likeability; questions 28, 31, 32 and 34 for animacy; and questions 29, 33, 36 and 37 for persuasiveness.<br />
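Composing the dependent variables from questionnaire items can be sketched as follows (the response dictionary is hypothetical; the question numbers are those listed above):<br />

```python
# Question numbers per concept, as listed in the Design section.
SCALE_ITEMS = {
    "likeability":    [27, 30, 35, 38, 40],
    "animacy":        [28, 31, 32, 34],
    "persuasiveness": [29, 33, 36, 37],
}

def scale_score(responses: dict, concept: str) -> float:
    """Mean Likert score (1-5) over the questionnaire items of one concept.

    `responses` is a hypothetical dict mapping question number -> answer.
    """
    items = SCALE_ITEMS[concept]
    return sum(responses[q] for q in items) / len(items)
```

Averaging the items of a scale into one score per participant is the usual preparation step before reliability and group comparisons.<br />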
<br />
Because the survey is about water consumption and persuadability, two covariates were constructed that could influence the way participants responded to the comments of Will: how easily a participant is convinced and how much a participant cares about the environment. Both covariates are composed of several questions in the questionnaire: questions 5, 8, 13 and 18 for how easily someone is convinced, and questions 4, 6, 10, 11, 12, 15, 17 and 19 for how much someone cares about the environment. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts.<br />
The first part was a general one in which demographic information was asked, together with some questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly guessing the research goal. The second part consisted of a simulation in which participants answered questions about their showering habits; after each answer, an audio fragment was played that gave either positive or negative feedback. The third part contained questions about how the voice was experienced. Finally, some questions were added about general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were recruited from the friends lists of our Facebook accounts. Facebook was used to avoid including elderly people who are unable to change their showering habits because they live in a care home. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One additional person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. This left 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1; the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed level of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; the values were 0.85, 0.89, and 0.85 respectively. The concepts will be discussed in that order. <br />
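The reliability values were presumably obtained with the standard Cronbach's alpha formula, which can be sketched as follows (the score matrix is hypothetical, not the study's data):<br />

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (participants x items) matrix of Likert
    scores: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)
```

Items that move together yield an alpha near 1, which is the pattern behind scale reliabilities like the 0.85-0.89 reported here.<br />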
<br />
To test whether the persuasiveness of the voice was perceived differently between the two conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
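The reported <math>\eta ^2</math> values for the ANOVAs correspond to SS_between / SS_total. A sketch of that computation for two groups (hypothetical data, assuming SciPy's f_oneway for the p-value):<br />

```python
import numpy as np
from scipy.stats import f_oneway

def anova_eta_squared(group_a, group_b):
    """One-way ANOVA p-value plus eta^2 = SS_between / SS_total."""
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    _, p_value = f_oneway(a, b)
    pooled = np.concatenate([a, b])
    grand_mean = pooled.mean()
    ss_between = (len(a) * (a.mean() - grand_mean) ** 2
                  + len(b) * (b.mean() - grand_mean) ** 2)
    ss_total = ((pooled - grand_mean) ** 2).sum()
    return p_value, ss_between / ss_total
```

With clearly separated groups, eta-squared approaches 1; the near-zero values reported above indicate that condition explained almost none of the variance in persuasion.<br />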
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
It was also tested whether persuasion differed between the conditions for the participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) and for those who heard at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02). <br />
<br />
[[File: Persuasion.jpg |350px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test examined whether the participants in the emotion condition rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normality across both conditions was rejected), the non-parametric Kruskal-Wallis test (Stata's kwallis) was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. The effect size was calculated with the formula <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS 625). Figure 2 shows these results. The y-axis represents the scale from the Godspeed questionnaire, which runs from 1 to 5.<br />
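Stata's kwallis corresponds to SciPy's kruskal, and the effect-size formula above can be attached to it directly (a sketch with hypothetical data):<br />

```python
from scipy.stats import kruskal

def kruskal_eta_squared(group_a, group_b):
    """Kruskal-Wallis p-value plus eta^2 = chi^2 / (N - 1),
    the effect-size formula used in the text (NAU EPS 625)."""
    h_statistic, p_value = kruskal(group_a, group_b)
    n_total = len(group_a) + len(group_b)
    return p_value, h_statistic / (n_total - 1)
```

The H statistic is asymptotically chi-squared distributed, which is why it can be plugged into the chi-squared effect-size formula.<br />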
<br />
As a follow-up, the Kruskal-Wallis test was executed twice more: once with only the participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only the participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same y-axis scale as figure 2. <br />
<br />
[[File: Likeability1.jpg | 350px |Figure 2: Likeability per condition]]<br />
[[File: Likeability2.jpg | 350px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when the heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between the two conditions; it was chosen because the ANOVA assumptions were not met (equal variance across the groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4, whose y-axis represents the Godspeed scale, encoded from 1 to 5.<br />
<br />
To test whether the type of emotion heard influences perceived animacy, the participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested separately, as well as the participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Animacy.jpg | 350px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept of persuasion did not show a significant difference between the conditions: the p-value was high and the effect size was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. Moreover, a significant small-to-medium effect on likeability was found between the conditions when only the participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
The findings for both persuasion and likeability go against the hypotheses formulated in the introduction. This can be explained in several ways. For persuasion, emotion alone might not be enough to persuade people to change their behavior. Some participants gave the feedback that the content of the sentence also matters: more information about water consumption and constructive arguments should be given. Leaving out this information was done deliberately, to focus only on the emotional context of the sentence instead of the informational content. <br />
<br />
For likeability another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice human-like and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: perceived animacy was higher in the condition with emotion than in the condition without. The effect size indicated a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). This finding was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for the people who heard at least four positive audio fragments and for the people who heard at least four negative fragments. The effect sizes for both these groups were larger than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice and perceived it as more lively because there was no other voice to compare it with. However, these findings were not significant, and their reliability is questionable because the two groups consisted of only 24 and 10 persons respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire; these limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so more participants would have been needed to enhance reliability. A second problem might be the kind of speech program that was used. The difference between a sentence with and without acoustic features of emotion was not easy to hear, which again decreases the chance of finding an effect. The reason lies in the way the program created the voices: as stated in the method, actors recorded entire sentences, which form the basis of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between sentences with and without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previous research stated in the introduction: combining multiple characteristics of emotions, e.g. acoustic and grammatical, has a reinforcing effect. <br />
<br />
Now let us look back at the problem posed at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them, and the freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice in speech technology, the implementation can be improved. Using someone's own voice already improves the level of animacy, and as this research shows, adding acoustic features of emotion to the voice enhances the level of animacy even more. The likeability of certain emotions also increases when the acoustic features are implemented. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for the persuasive technology mentioned in the introduction: after analysing the results there was no significant difference between using acoustic features of emotions or not, and no large effect either. However, the concept of persuasiveness was included for a broader application of the findings and is not necessarily relevant to the issue of ALS. This does not mean that the findings are not relevant for other applications. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. This research used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013), both acoustic features of emotion that were also used to create the voices in this research. Besides that, speech rate and pitch variations within sentences and words were manipulated here, but the explanation of Nao's text-to-speech program says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that Nao can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all of the parameters used can be changed on Nao. The purpose of Nao can vary a lot; if the goal it is used for is known, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of Nao's speech technology expand the set of changeable parameters, Nao can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: one made by Philips, Google TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand; parameters used in this research that can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options apply only to an entire fragment: it is not possible to change the settings within a sentence or even a word. eSpeak therefore does not contribute to a flexible system that can easily be used for communicating emotional sentences. The TTS developed by Philips, on the other hand, is already quite advanced: it is possible to select a certain emotion, including sad and excited (BRON). Besides that, the speech of Amigo is generated in real time, which means that parts of the sentences are predefined while other parts are filled in by Amigo itself. Amigo also has the possibility to choose among multiple sentences for specific situations (Lunenburg). These two characteristics of the Philips TTS make Amigo more flexible than Nao for communicating emotional sentences. <br />
<br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots. At this moment the applicability of the research depends on the kind of robot that is used, including its technical capabilities and purpose. <br />
<br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording. Examples are ‘Wow’ or sobbing sounds. This can enhance the findings of this research. Thus for further research this research can also be used as the basis. Besides that more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, it can also be improved. An important remark needs to be made that it will always be difficult for people to recognise an emotion that is only based on acoustic features. The reason for this is that the combination of multiple features lead to a correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arziona University EPS 625 – intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Comlumbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Week_7&diff=15723Week 72014-10-16T19:05:58Z<p>S126005: /* Mail contact with Janno Lunenburg */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
= Meeting Monday 13-10-2014 =<br />
Suzanne has finished the introduction. Iris, Meike and Floor still need to read it through carefully to check it.<br />
<br />
We still have to write the discussion, the conclusion and the scientific impact / ethics section. A large part of the discussion has already been written in Google Docs; it has not been put on the wiki yet because it is easier to edit in Google Docs.<br />
<br />
<br />
The report we have is good. It is the most important part of the wiki. The rest may stay structured per week, but should receive less attention.<br />
<br />
The effects are very small; you would really want to see them larger.<br />
<br />
We have to look at the graphs once more, because the scale runs from -2 to 2. Explain clearly what the numbers stand for (Likert scale).<br />
<br />
Not required, but good style if you do this:<br />
* You want error bars (95% confidence interval) -- we can use Excel for this, since there are only two numbers anyway.<br />
* One decimal place is fine<br />
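The error bars mentioned above can also be sketched in code. This is a minimal normal-approximation sketch with hypothetical numbers (the actual analysis was done in Excel); for small samples a t-distribution critical value would be more appropriate than z = 1.96:

```python
import math

def ci95(mean, sd, n):
    """Return the half-width of a 95% confidence interval.

    Uses the normal approximation (z = 1.96); for small n a
    t-critical value would be more accurate.
    """
    return 1.96 * sd / math.sqrt(n)

# Hypothetical Likert-scale results (mean on a -2..2 scale)
mean_score, sd_score, n = 0.3, 1.1, 40
half_width = ci95(mean_score, sd_score, n)
lower, upper = mean_score - half_width, mean_score + half_width
print(round(half_width, 2))  # length of the error bar above/below the bar
```

The half-width is what Excel expects as the "custom error amount" when adding error bars to a bar chart.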
<br />
# What did we actually measure?<br />
# Can we answer the hypothesis?<br />
# Give an explanation for the results (come up with one yourself)<br />
## If the effect exists at all, it is small<br />
## People may only be persuaded by the information, not by the emotion. So substantive information is important; that is what persuades people. We can support this with Alex's report. <br />
<br />
Based on these results we can say that emotion adds nothing to persuasiveness.<br />
<br />
The animacy does change, so people definitely notice the difference. This is in fact also what we want for someone with ALS.<br />
<br />
Add a user-aspect section to the report and let it return in the discussion: a USER perspective, and to some extent a SOCIETY perspective.<br />
Suppose you wanted to sell this: what would you say to present it well? <br />
It is not something that can be applied within a week; there are still a lot of steps to go (it currently costs a great deal of money). What are these steps?<br />
<br />
With the knowledge you have now gained, is this also possible with the Nao or Amigo robot? If so, how? What would I have to do to add this to my robot? Do these robots meet the requirements to apply it? Reflect with arguments on what it would be worth.<br />
* On the Nao, every word can be adjusted individually. You can also load .wav files. Nao uses Nuance. The developers are actually further along than what currently runs on the robot.<br />
* Ask René where to find how to implement it on Amigo.<br />
<br />
<br />
===== Presentation and assessment =====<br />
<br />
It takes 15 minutes: roughly 10 minutes on the research and 5 minutes on the user aspect.<br />
<br />
Present the technical innovation, but above all in combination with the user perspective.<br />
<br />
What did you promise at the start, and what did you and did you not manage to deliver?<br />
<br />
After the presentation there is a quarter of an hour for questions. This also counts towards the assessment.<br />
<br />
<br />
We will be assessed on:<br />
* the first two presentations<br />
* the final presentation<br />
* the discussion<br />
* the wiki<br />
* the process<br />
* the peer review<br />
<br />
Peer review:<br />
Write one together and send it to René. Also give each other absolute marks from 1 to 10.<br />
<br />
<br />
===== TODO for the final week =====<br />
<br />
* Put the sources in the introduction in APA style<br />
* Possibly clean up the do-file<br />
* Put the do-file on the wiki<br />
<br />
<br />
Adjust the graphs:<br />
* check the scales (y-axis) of the graphs<br />
* round the numbers above the bars to 1 (maybe 2) decimal places<br />
* show error bars<br />
<br />
<br />
Applicability in robots:<br />
* Nao<br />
* Amigo<br />
<br />
<br />
Finish the discussion:<br />
* For each y-variable, give good reasons for the outcome (problems with our study and/or based on sources)<br />
* Conclusion for animacy<br />
* Relate to earlier sources<br />
* Use the 'desert study' source.<br />
<br />
<br />
Tidy up the wiki:<br />
* Order everything per week<br />
* Make sure all the important information is in the summary.<br />
<br />
= Meeting Wednesday 15-10-2014 =<br />
<br />
We discussed the interpretation of the results. <br />
<br />
Should we mainly look at p-values, or are effect sizes more important?<br />
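As a reference point for this discussion: an effect size such as Cohen's d can be computed directly from group means and standard deviations, and the rule-of-thumb thresholds (roughly 0.2 small, 0.5 medium, 0.8 large) come from the CBU source in our reference list. A minimal sketch with hypothetical numbers, not our actual data:

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent groups, using a pooled SD."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )
    return (mean1 - mean2) / pooled_sd

# Hypothetical condition means on the -2..2 Likert scale
d = cohens_d(0.5, 1.0, 30, 0.3, 1.0, 30)
# With these numbers d = 0.2: a "small" effect by the CBU rules of thumb,
# even if a test on a large sample would report a significant p-value.
```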
<br />
We obtained certain outcomes; how can we explain them?<br />
<br />
We have tried to write down our conclusions as well as possible.<br />
<br />
<br />
= Meeting Thursday 16-10-2014 =<br />
Floor structured the wiki and made all the weeks consistent. <br />
<br />
Meike and Floor worked on the graphs. The confidence intervals had to be computed first; after that we made the graphs in Excel.<br />
<br />
Iris looked up information about the TTS of Amigo and Nao. Mr Lunenburg was also emailed, but that was probably too late. With the information from the internet we were able to finish it properly in the end.<br />
<br />
Suzanne and Meike continued writing the points of the discussion that were not yet finished.<br />
<br />
Suzanne looked into the 'desert source'. It turned out not to contain the information we were looking for.<br />
<br />
= Mail contact with Janno Lunenburg =<br />
<br />
We had email contact with Janno Lunenburg, who works on the software side of Amigo. He gave us information about Amigo's TTS.<br />
<br />
This is what we received from him:<br />
<br />
<br />
AMIGO's speech synthesis is from Philips. Information about it can be found at: http://www.extra.research.philips.com/text2speech/ttsdev/index.html<br />
<br />
In normal use there are four things you can configure (listed, among other places, on the website above under 'Features'):<br />
* Language: AMIGO speaks Dutch and English<br />
* Voice: AMIGO has a number of pre-programmed voices: as far as I know a male and a female voice in Dutch and English, and a male voice in Spanish and French. Incidentally, you can make a Dutch voice speak English (or the other way around), but you may then get a strange (or funny) accent<br />
* Emotion: AMIGO typically uses neutral, 'excited' or sad<br />
* Character (e.g. man, old man, woman, boy, girl, robot, giant, dwarf and alien). We never change this, though.<br />
<br />
<br />
Incidentally, you can also fully customise pitch etc. This is what happens, for example, in the 'Singing TTS' example on the website above. However, it is defined precisely in a text file and is a feature we have never used ourselves (we have only ever played the demo).<br />
<br />
<br />
AMIGO's speech is all generated in real time; sound files are not simply put on the robot and played back. The sentences are in principle pre-programmed, but there is variation in them. In many situations the robot has several possible sentences, from which one is selected at random. Parts of a sentence also have to be filled in: if the robot says what it sees on a table, it fills that in itself (the pre-programmed sentence is "I see {0}", where {0} is replaced by what it has seen).<br />
<br />
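The sentence-generation scheme described above (several pre-programmed sentences per situation, one picked at random, with placeholders filled in) can be sketched as follows. This is a minimal illustration with made-up templates, not AMIGO's actual code:

```python
import random

# Hypothetical pre-programmed sentence templates per situation
TEMPLATES = {
    "object_seen": ["I see {0}", "There is {0} on the table", "I have found {0}"],
    "greeting": ["Hello!", "Good day!"],
}

def generate_sentence(situation, *args):
    """Pick one of the situation's templates at random and fill in {0}, {1}, ..."""
    template = random.choice(TEMPLATES[situation])
    return template.format(*args)

print(generate_sentence("object_seen", "a cup"))  # e.g. "I see a cup"
```

Random selection among equivalent templates is what makes the speech sound varied even though every sentence is pre-programmed.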
Finally, in the past we have also used Google TTS (online) and Festival/eSpeak (a Linux tool). The latter can, as far as I know, also be customised, but I have no experience with it.</div>
S126005https://cstwiki.wtb.tue.nl/index.php?title=Logboek&diff=15711Logboek2014-10-16T16:09:19Z<p>S126005: </p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Datum<br />
! width="10%" | Tijd<br />
! width="60%" | Beschrijving<br />
! width="25%" | Wie?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Inleidend college met uitleg van het project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Onderwerp bedenken in college<br />
| Meike + Suzanne + Iris + Floor <br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 4 uur<br />
| Bronnen zoeken over technieken waarmee je kunt schrijven zonder je handen (dus met ogen of hersenen)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 3 uur<br />
| Bronnen zoeken over emoties in stemgeluid (en toegepast in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over in hoeverre op dit moment techniek ontwikkeld is om robots te laten spreken.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over voice-cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 sept<br />
| 6 uur<br />
| Bronnen delen en bespreken wat we willen en wat we gaan doen<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 sept<br />
| 30 min<br />
| Afspraak met Raymond Cuijpers om vragen te stellen over mogelijkheden<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 sept<br />
| 4 uur<br />
| Uiteindelijk idee vastleggen en presentatie voor maandag voorbereiden<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 1 hour<br />
| Read an article on emotions and worked on the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 15 min<br />
| Called Suzanne for an update about Friday<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 20 min<br />
| Improved the explanation on the slides and improved the research question<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 75 min<br />
| Searched for sources on characteristic aspects of emotions in speech<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 45 min<br />
| Read an article on emotions and improved the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Prepared the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1.5 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Lecture with presentations<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Meeting: feedback, general action items, agreeing on fixed meeting days and making a small plan.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Cleaned up the week 1 wiki and presented it clearly<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Searched for sources on text-to-speech systems and information about Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Looked up Nao's speech-related functions<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 10 min<br />
| Added to the feedback on the presentation of Monday 8 September<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 2 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 4 hours<br />
| Searched for, chose and studied TTS systems, among others in Matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 1.5 hours<br />
| Searched for articles on emotions in speech<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 30 min<br />
| Read the article on emotions that Suzanne found<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 90 min<br />
| Worked in Matlab with pitch and speech rate.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Scanned yesterday's articles for relevance.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Processed sources and updated the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Found and read the article Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency and put an explanation on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 6 hours<br />
| Discussing progress, a different angle on the research<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 3 hours<br />
| Put a very detailed planning on the wiki ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Made the design of the planning for the presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 2 hours and 15 min<br />
| Searched for sources on our new angle.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 15 min<br />
| Put the outcome of the meeting of Thursday 11 September on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 30 min<br />
| Established milestones and deliverables and worked on the presentation<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 3 hours<br />
| Made the presentation and put it on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 1 hour<br />
| Started on the overview of emotion features + values<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 2 hours and 30 min<br />
| Finishing the overview of emotion features + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 Sept<br />
| 5 hours<br />
| Finishing the overview of emotion features + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 1 hour and 30 min<br />
| Going through last week's feedback and this week's planning.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 2 hours<br />
| Presentations<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 4 hours<br />
| Searched for a new TTS program that does not modify the voice, and adapted Matlab functions<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 15 min<br />
| Put the feedback on the presentation of Monday 15 September in week 2.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 5 hours and 15 min<br />
| Investigated options for adjusting aspects of the robot voice<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 1 hour and 30 min<br />
| Research into the best way to work out the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 3 hours<br />
| Searched for sources on persuasiveness and put them on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 4 hours<br />
| Working out the setup for the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 5 hours<br />
| Group meeting in which we came up with the idea for the survey and told each other what we had done this week.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 1.5 hours<br />
| Looking at the final sentences and how to approach this<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 2 hours<br />
| Devising and working out the survey setup<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Sept<br />
| 8 hours<br />
| Recording and editing the robot voice. It is now finished: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 3 hours<br />
| Investigating the options for online surveys.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 5 hours<br />
| Working out the final version of the surveys<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 1 hour and 30 min<br />
| Consultation about the surveys via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 2 hours<br />
| Discussing the survey setup / discussing progress<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Feedback session<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1.5 hours<br />
| Putting audio fragments on YouTube<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1 hour<br />
| Performed a power analysis.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 Sept<br />
| 30 min<br />
| Discussing what to do about the number of participants<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour and 30 min<br />
| Meeting with Raymond and debriefing.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour and 30 min<br />
| Progress meeting.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 45 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 1 hour<br />
| Survey discussion.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 3 hours and 45 min<br />
| Putting the survey into Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 6 hours<br />
| Made extra sentences.<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Put the extra sentences on YouTube.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 3 hours<br />
| Drew up the Godspeed questionnaire for the end of the survey and continued working on the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 15 min<br />
| Checking the questionnaire so far and looking for improvements<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 5 hours<br />
| Made part of the questionnaire in Google Docs<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Giving [[Feedback questionnaire]]<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey on Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 4 hours<br />
| Did a pilot study with my parents and evaluated it<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 1 hour<br />
| Consultation about the survey<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 15 min<br />
| Looking at the e-mail from [[Acapelabox]] and processing the information<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Feedback questionnaire<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 3 hours<br />
| Discussing progress, giving personal feedback and improving the questionnaire<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Coach meeting<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour and 30 min<br />
| Re-uploading the videos to YouTube because not everything went well last time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour and 30 min<br />
| Made different versions of the questionnaires.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Checking whether all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour and 30 min<br />
| Going through the surveys once more, adjusted questions and removed cosmetic errors.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 45 min<br />
| Going through the surveys once more and removed small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 1 hour and 30 min<br />
| Inviting people for the survey.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people for the survey.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people for the survey.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 3 hours and 30 min<br />
| Looking through the surveys and inviting people for the survey<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sending reminders for the survey and looked at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and devised the analysis.<br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour<br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour<br />
| Worked on the design (method)<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min<br />
| Sending people a reminder to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours and 30 min<br />
| Writing 'Participants' and coding data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours<br />
| Reminded people to fill in the survey and made a setup for the introduction<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour and 50 min<br />
| Finished the design (method) and updated research setup 2.0 (week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours and 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the survey.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour and 30 min<br />
| Coding data<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and placed it on the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and looked for the last participant<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Made the first part of the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting, coding data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour and 30 min<br />
| Checking the data coding<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour and 30 min<br />
| Finished the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour and 15 min<br />
| Checking the coding<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 3 hours and 45 min<br />
| Carried out all of Tuesday morning's tasks. See [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Oct<br />
| 6 hours<br />
| Carried out all of Wednesday evening's tasks and started on Thursday afternoon's tasks. See [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 2 hours<br />
| Finished the introduction and made the reference list/citations<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Oct<br />
| 4.5 hours<br />
| Meeting including the coach, feedback and a new planning<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Oct<br />
| 5 hours<br />
| Discussion about the conclusion/discussion and further writing.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Oct<br />
| 6 hours and 45 min<br />
| Finishing the discussion, checking and cleaning up the wiki, improving the graphs.<br />
| Floor + Meike + Iris + Suzanne<br />
|}<br />
<br />
Total number of hours: 530 hours</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15710Samenvatting2014-10-16T15:57:46Z<p>S126005: /* Method */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by the constitution and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that spread all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like a prisoner in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came up with an explanation for this problem: they state that emotion cannot be recognized from acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences that are spoken without emotion, but that express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice has a positive effect on likeability, animacy and persuasiveness (Yoo, & Gretzel, 2011) (Vosse, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has a broader application. It can be used when no anatomical features of emotions are available, but when it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different functions 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This can be very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happiness and anger do not differ much, this value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007), so the Acapela Box was used to change this voice shape.<br />
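As an illustration of these speech-rate figures, the expected duration of an utterance can be estimated from its syllable count. This is a minimal sketch; the helper function is ours for illustration, not part of Acapela Box:

```python
def utterance_duration(n_syllables: int, syllables_per_second: float) -> float:
    """Estimated duration in seconds of an utterance spoken at a given rate."""
    return n_syllables / syllables_per_second

# The same 20-syllable sentence at the rates reported by Williams & Stevens:
sad_duration = utterance_duration(20, 1.91)    # roughly 10.5 s
happy_duration = utterance_duration(20, 4.15)  # roughly 4.8 s
```

The sad rendering of a sentence thus takes more than twice as long as the happy one, which is why the speech-rate setting matters so much for the perceived emotion.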
<br />
Audacity is a free, open-source, cross-platform program to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function 'Amplify' you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives the voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect 'Change Pitch' and filling in the percentage by which you want to change the pitch. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function 'Change Tempo' in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when one is happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With 'Change Tempo' the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
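The 7 dB amplitude difference reported by Bowles and Pauletto can be translated into a linear gain factor with the standard decibel formula; a minimal sketch (the function name is ours, not an Audacity API):

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in decibels into a linear amplitude ratio."""
    return 10 ** (db / 20)

# A +7 dB 'Amplify' step makes the happy fragments about 2.2 times
# the peak amplitude of the sad fragments.
gain = db_to_amplitude_ratio(7)
```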
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every type of question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document (for the program Microsoft Office Excel) (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program for working with spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test is not needed. <br />
However, it is likely that the effect size is small rather than moderate. Another power analysis was executed to see how many participants would be needed for a small effect size: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect that many participants were not available, the experiment was executed with 51 participants per condition. <br />
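The power analysis above can be reproduced with, for example, `statsmodels`; this is a sketch under the stated assumptions (power 0.8, alpha 0.05, and Cohen's d of 0.5 for 'moderate' and 0.2 for 'small'), not the tool the team actually used:

```python
import math
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

def n_per_group(effect_size: float, alternative: str) -> int:
    # Solve for the required sample size per condition and round up.
    n = analysis.solve_power(effect_size=effect_size, alpha=0.05,
                             power=0.8, alternative=alternative)
    return math.ceil(n)

moderate_two = n_per_group(0.5, "two-sided")  # 64 participants per condition
moderate_one = n_per_group(0.5, "larger")     # 51 participants per condition
small_two = n_per_group(0.2, "two-sided")     # 394 participants per condition
small_one = n_per_group(0.2, "larger")        # 310 participants per condition
```

These values match the figures reported in the text.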
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices. Each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5 and are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were created that could influence the way participants responded to Will's comments. These covariates are how easily one is convinced and how much one cares about the environment. Both covariates are composed of several questions in the questionnaire: how easily one is convinced of questions 5, 8, 13 and 18, and how much one cares about the environment of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one, in which some demographic information was asked for. In addition, some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone filled in some questions about their showering habits. After each answer, an audio fragment was played that gave either positive or negative feedback. The third part contained questions about how the voice was experienced. Finally, we added some questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friends lists of our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. In addition, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed from the data set because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. After this, 94 participants remained.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha. These values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
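Cronbach's alpha for a scale can be computed directly from the item responses; a minimal sketch in Python (not the actual Stata commands used in the study):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a participants x items matrix of scale scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items in the scale
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the scale totals
    return k / (k - 1) * (1 - item_vars / total_var)
```

Applied to the five likeability items (questions 27, 30, 35, 38 and 40) per participant, this formula should reproduce the 0.89 reported above.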
<br />
To test whether the persuasiveness of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they said that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the emotion condition rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (kwallis in Stata) was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect size was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
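The same test and effect size can be sketched with SciPy's `kruskal` (an open-source analogue of Stata's kwallis), applying the <math> \eta ^2=\frac{\chi ^2}{N-1}</math> formula above; the ratings used here are made up for illustration, not the study's data:

```python
from scipy.stats import kruskal

def kruskal_eta_squared(group1, group2):
    """Kruskal-Wallis H test plus the chi-square based eta-squared effect size."""
    stat, p_value = kruskal(group1, group2)
    n = len(group1) + len(group2)
    return stat, p_value, stat / (n - 1)

# Illustrative Likert ratings for the two conditions:
h, p, eta2 = kruskal_eta_squared([1, 2, 3], [4, 5, 6])
```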
<br />
As a follow-up, the Kruskal-Wallis test was executed another two times: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the assumptions for an ANOVA were not met (equal variance across both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4. The y-axis of figure 4 represents the scale encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there must be more information available about the water consumption, and convincing arguments should be given. Leaving out information was done deliberately to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability, another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It was found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice that was used human-like, and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy, a significant effect was found between the two conditions: in the condition with emotion, the perceived animacy was higher than in the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes were larger for both these groups than for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice; they might therefore have perceived it as more lively, because there was no other voice to compare it with. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of only 24 and 10 persons respectively. <br />
<br />
Other issues that can be improved deal with the design of the questionnaire. These limitations have an influence on all three concepts. To begin, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotion and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as stated in the method, actors recorded entire sentences, which form the base for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previous research stated in the introduction: combining multiple characteristics of emotion, e.g. acoustic and grammatical, has a reinforcing effect. <br />
<br />
Now let us look back at the problem that was given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone's own sound, the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results, there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013); these are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the explanation of the text-to-speech program of NAO says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot: if you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of NAO expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second mentioned social robot was Amigo. Amigo uses three text-to-speech programs: an experimental one made by Philips, Google TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research and can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options can only be set for a whole fragment; it is not possible to change these settings within a sentence or even a word. For Amigo the same conclusion can be drawn as for NAO: not all parameters that were changed in this research can be changed for the voice of Amigo. So again, if the aim of the robot is known, predefined sentences can be programmed and adjusted to create the right emotion, although this emotion will, just as for the NAO robot, not be expressed as well as in the performed research. Further improvements of the speech synthesizer of Amigo are needed to make Amigo truly useful for producing emotionally loaded sentences. <br />
Overall, the findings of this research are useful to increase the perceived animacy and likeability of robots, but further improvements are needed to express the emotions more powerfully. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, such as 'Wow' or sobbing sounds. This could enhance the findings of this research, so further research can use this research as a basis. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion that is based only on acoustic features, because it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. (2013). Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>
S126005
https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15708
Samenvatting
2014-10-16T15:39:27Z
<p>S126005: /* References */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by the constitution and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that spread all over Facebook. The idea of the campaign was to raise awareness for this disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like a prisoner in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the muscles controlling the vocal cords can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came with an explanation for this problem: they state that emotion cannot be recognized from acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program, it is not possible to include anatomical features. Nevertheless it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion, but which express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotion? This question leads to the research question of this research:<br />
<br />
What is the effect on human perception of adding acoustic features of emotion to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application: it can be used whenever no anatomical features of emotions are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that want to convince people to behave in a certain way, some of which use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, leading to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different settings 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This is very useful for different emotions. For sadness, sentences are spoken more slowly than for happiness, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, the value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
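The speech-rate figures above can be used to work out how long an emotional rendering of a sentence should take. A hypothetical helper (the rates come from the sources cited above; the function and dictionary are only a sketch, not part of the original workflow): <br />

```python
# Syllables per second per emotion (Williams & Stevens, 1972; Breazeal, 2001)
SYLLABLES_PER_SECOND = {"sad": 1.91, "happy": 4.15, "angry": 4.15}

def target_duration(n_syllables, emotion):
    """Approximate spoken duration in seconds for a sentence in a given emotion."""
    return n_syllables / SYLLABLES_PER_SECOND[emotion]
```

For example, a 12-syllable sentence takes about 6.3 s when sad but only about 2.9 s when happy, so the sad version must be stretched to more than twice the length. <br />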
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity has many functions with which you can give an audio fragment an emotional tone. With the function 'Amplify' you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average, and the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect 'Change Pitch' and filling in the percentage by which you want to change the pitch. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function 'Change Tempo' in Audacity, the length of this break can be made longer. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With 'Change Tempo' the speed of the voice changes but the pitch does not, and this is exactly what is needed for these emotions.<br />
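The Amplify and Change Tempo adjustments are simple arithmetic. A sketch of both (function names are illustrative; in practice the effects were applied inside Audacity itself): <br />

```python
def db_to_gain(db):
    """Linear amplitude factor for a gain in dB, as applied by an Amplify effect."""
    return 10 ** (db / 20)

def slowed_duration(duration_s, slowdown_pct):
    """New length of a word or pause after slowing it down by a percentage."""
    return duration_s * (1 + slowdown_pct / 100)
```

The 7 dB difference between happy and sad corresponds to a factor of about 2.24 in amplitude, and a long word slowed by 20% for the sad voice becomes 1.2 times its original length. <br />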
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program for spreadsheets. The spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test is not needed. <br />
However, it is presumable that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
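These sample sizes can be approximated with the standard normal-approximation formula n = 2((z_alpha + z_beta)/d)^2, which lands one or two participants below the exact t-based numbers quoted above. A sketch under that assumption (the original power analysis was not necessarily computed this way):<br />

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80, two_sided=True):
    """Normal-approximation sample size per group for a two-sample t-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)
```

Here n_per_group(0.5) gives 63 (text: 64, after the t correction) and n_per_group(0.2, two_sided=False) gives 310, matching the one-tailed small-effect figure. <br />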
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group heard only one of the conditions. These two conditions made up the independent variable, which therefore is a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5 and are interval variables. The dependent variables were composed of several questions in the questionnaire: questions 27, 30, 35, 38 and 40 for likeability; questions 28, 31, 32 and 34 for animacy; and questions 29, 33, 36 and 37 for persuasiveness.<br />
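Scoring a participant then amounts to averaging their answers to the items of each scale. A minimal sketch, assuming responses are stored as a question-number-to-answer mapping (the data structure is illustrative; the actual scoring was done in Stata):<br />

```python
# Item numbers per scale, taken from the questionnaire described above
SCALES = {
    "likeability": [27, 30, 35, 38, 40],
    "animacy": [28, 31, 32, 34],
    "persuasiveness": [29, 33, 36, 37],
}

def scale_score(responses, scale):
    """Mean of one participant's 1-5 Likert answers on the items of a scale."""
    items = SCALES[scale]
    return sum(responses[q] for q in items) / len(items)
```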
<br />
Because the survey is about water consumption and persuadability, two covariates were made that could influence the way participants responded to the comments of Will: how easily you are convinced, and how much you care about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of questions 5, 8, 13 and 18; how much you care about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts. The first part is the general one, in which some demographic information was asked, as well as questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone filled in some questions about their showering habits; after each answer, an audio fragment was played that gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friends lists of our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. After this, 94 participants were left.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability, and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha; the values were 0.85, 0.89, and 0.85 respectively. The concepts will be discussed in the given order. <br />
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree); zero means neutral. <br />
<br />
After this, a second test was done in which participants were only included if they indicated that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect size was calculated with the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
<br />
As a follow-up, the Kruskal-Wallis test was executed another two times: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal–Wallis test was done to test whether there is a difference in animacy between both conditions. A Kruskal–Wallis test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in Figure 4. The y-axis of Figure 4 represents the scale, encoded from 1 to 5, retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard influences perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested separately, as were participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept of persuasion did not show a significant difference between the conditions, as the high p-value indicates. In addition, the effect size was close to zero. Even after controlling for willingness to adjust showering habits, the effect remained nonsignificant, and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence also matters: more information about water consumption should be available, and convincing arguments should be given. Leaving out this information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used here human-like and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). This finding was in line with the hypothesis: an emotional voice is perceived as livelier than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes for both these groups were larger than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice; they might have perceived it as livelier because they heard no other voice to compare it with. However, these findings were not significant, and it is questionable how reliable they are, because the two groups consisted of only 24 and 10 persons, respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotion and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as stated in the method, actors recorded entire sentences, and these recordings form the base of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use one for this research. <br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotion are combined, e.g. acoustic and grammatical ones, they have a reinforcing effect. <br />
<br />
Now let us look back at the problem posed at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but this becomes impossible in many cases as the illness develops, and increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. Using someone's own voice already improves the level of animacy, and as this research shows, adding acoustic features of emotion to a voice enhances the level of animacy even more. The likeability of certain emotions will also increase when the acoustic features are implemented. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond, through speech, with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction: after analysing the results, there was no significant difference between using acoustic features of emotion or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; it is not necessarily relevant to the issue of ALS. This does not mean that the findings are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. This research used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013), which are both acoustic features of emotion that were also used to create the voices in this research. Speech rate and pitch variations within sentences and words were manipulated here as well, but the documentation of NAO's text-to-speech program says nothing about speed changes, and pitch variations within sentences and words are not mentioned as an option either. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all of the parameters used here can be changed on NAO. The aim of NAO can vary a lot; if the goal it is used for is known, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of NAO's speech technology expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
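The predefined-sentence approach just described amounts to a lookup from emotion to acoustic targets. The sketch below is a hypothetical illustration of such a table: the happy and sad numbers come from the sources cited in the Method section (Williams & Stevens, 1972; Bowles & Pauletto, 2010), while the neutral values and all names are illustrative placeholders, not part of any robot's API.

```python
# Hypothetical emotion -> acoustic-parameter lookup; the happy/sad numbers
# follow the Method section (4.15 vs. 1.91 syllables/s, +7 dB for happy),
# the neutral row is an illustrative placeholder.
EMOTION_PARAMS = {
    "happy":   {"rate_syll_per_sec": 4.15, "pitch": "high", "gain_db": 7.0},
    "sad":     {"rate_syll_per_sec": 1.91, "pitch": "low",  "gain_db": 0.0},
    "neutral": {"rate_syll_per_sec": 3.00, "pitch": "mid",  "gain_db": 0.0},
}

def params_for(emotion):
    # Fall back to neutral for emotions the table does not cover.
    return EMOTION_PARAMS.get(emotion, EMOTION_PARAMS["neutral"])
```

A front-end for a robot's text-to-speech engine could then apply whichever of these targets the engine actually exposes (on NAO, for instance, only pitch and volume).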
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: an experimental one made by Philips, Google's TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in this research and can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options apply only to a whole fragment; it is not possible to change these settings within a sentence or even a word. For Amigo the same conclusion can be drawn as for NAO: not all parameters that were changed in this research can be changed for Amigo's voice. So again, if the aim of the robot is known, predefined sentences can be programmed and adjusted to create the right emotion, although this emotion will, just as for the NAO robot, not be expressed as well as in the performed research. Further improvements to Amigo's speech synthesizer are needed to make it truly useful for producing emotionally loaded sentences. <br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots, but further improvements are needed to express the emotions more powerfully. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, such as 'Wow' or sobbing sounds. This could enhance the findings of this research, so this research can serve as a basis for further work. Besides that, more research is needed on the characteristics of acoustic features of emotions. Although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, as it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate Statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. (2013). Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that spread all over Facebook; the idea of the campaign was to raise awareness of the disease and money for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways; for example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972; Breazeal, 2001; Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample groups to successfully recognize certain emotions, like sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized from acoustic features alone. Busso et al. (2004) report that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions, and Scherer, Ladd, & Silverman (1984) found that the combination of acoustic and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is based only on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences spoken without emotion that nevertheless express emotion grammatically: the sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotion? This question leads to the research question of this study:<br />
<br />
What is the effect on the perception of humans by adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011; Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application as well. It can be used whenever no anatomical features of emotion are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that want to convince people to behave in a certain way, and some of these devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from your text using their voices. The voice of Will was used because it is English (US) and has the functions 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009), which is very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, it was chosen to use this value of 4.15 syllables per second for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is a free, open-source, cross-platform program to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function 'amplify' you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average, according to the research by Bowles and Pauletto, and the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half (Williams & Stevens, 1972); a sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the pitch-adjustment effect and filling in the percentage by which you want to change the pitch. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function 'change tempo' in Audacity, the length of this break can be made longer. This function can also be used to lengthen or shorten the words of a sentence individually, which is useful because words with only one syllable are pronounced faster when happy, and when a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function 'change tempo' the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
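The 7 dB amplitude difference mentioned above corresponds to multiplying the waveform samples by 10^(7/20) ≈ 2.24. The NumPy sketch below is a minimal illustration of what an 'amplify' effect does to the samples (function name and sample values are made up for the example):

```python
import numpy as np

def amplify(samples, gain_db):
    # A gain of g dB multiplies the samples by 10**(g/20);
    # +7 dB (the happy-vs-sad difference reported above) is a factor of ~2.24.
    factor = 10.0 ** (gain_db / 20.0)
    return np.clip(samples * factor, -1.0, 1.0)  # keep samples in [-1, 1]

quiet = np.array([0.10, -0.20, 0.30])
louder = amplify(quiet, 7.0)
```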
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a survey together with others at the same time; it is a tool used to collect information. Audio fragments (via video) can be inserted and every kind of question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program for working with spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed with a two-tailed t-test and 51 participants per condition with a one-tailed t-test. A one-tailed t-test would be suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which an effect could occur, and a two-tailed t-test would not be needed. <br />
However, it is likely that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed in that case: for a two-tailed t-test 394 participants per condition and for a one-tailed t-test 310 participants per condition. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
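The sample sizes quoted above can be reproduced with a standard a priori power calculation. The sketch below assumes the statsmodels package and is not the original analysis script; the function name is illustrative.

```python
from math import ceil
from statsmodels.stats.power import TTestIndPower  # assumed available

def n_per_group(effect_size, two_tailed=True):
    # Per-condition sample size for an independent-samples t-test,
    # at power 0.8 and alpha 0.05, as in the power analysis above.
    n = TTestIndPower().solve_power(
        effect_size=effect_size, alpha=0.05, power=0.8,
        alternative="two-sided" if two_tailed else "larger")
    return ceil(n)

# Moderate effect (d = 0.5): 64 two-tailed, 51 one-tailed per condition
# Small effect (d = 0.2): 394 two-tailed, 310 one-tailed per condition
```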
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group heard only one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables were measured on a Likert scale varying from 1 to 5 and are interval variables. The dependent variables were composed of several questions in the questionnaire: for likeability, questions 27, 30, 35, 38 and 40 were used; for animacy, questions 28, 31, 32 and 34; and for persuasiveness, questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were created that could influence the way participants responded to the comments of Will. These covariates are how easy you are to convince and how much you care about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of questions 5, 8, 13 and 18, and how much you care about the environment of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
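Composing a scale from several Likert items typically means averaging the item scores per participant. The pandas sketch below is illustrative only (the column names mirror the question numbers above; the response values are made up):

```python
import pandas as pd

# Two hypothetical participants; columns named after the likeability items above
df = pd.DataFrame({
    "q27": [4, 2], "q30": [5, 3], "q35": [4, 2], "q38": [3, 3], "q40": [4, 2],
})

LIKEABILITY_ITEMS = ["q27", "q30", "q35", "q38", "q40"]
# The scale score is the mean of the item scores, staying on the 1-5 range
df["likeability"] = df[LIKEABILITY_ITEMS].mean(axis=1)
```

The animacy, persuasiveness, and covariate scales would be built the same way from their own item lists.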
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts. The first part was a general one asking for demographic information and for two personal characteristics; furthermore, some questions were included to prevent participants from directly guessing our research goal. The second part consisted of a simulation in which everyone filled in some questions about their showering habits; after each answer, an audio fragment was played that gave either positive or negative feedback. The third part contained questions about how the heard voice was experienced. Finally, we added some closing questions to give an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were recruited from the friends lists of our Facebook accounts. Facebook was used to avoid including elderly people who are unable to change their showering habits because they live in a care home. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. After this, 94 participants were left.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha; the values were 0.85, 0.89, and 0.85, respectively. The concepts will be discussed in the given order. <br />
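Cronbach's alpha for a scale of k items can be computed directly from the item variances. The NumPy sketch below is illustrative, not the Stata code that was actually used:

```python
import numpy as np

def cronbach_alpha(items):
    # items: 2-D array, rows = participants, columns = scale items.
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars / total_var)
```

Values around 0.85, as reported above, are conventionally taken to indicate good internal consistency.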
<br />
To test whether the persuasiveness of the voice was perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), with zero meaning neutral. <br />
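For the ANOVA, η² is the between-group sum of squares divided by the total sum of squares. The sketch below (assuming SciPy; the two small groups are made-up data, not the study's) shows the computation:

```python
import numpy as np
from scipy.stats import f_oneway

def anova_eta_squared(*groups):
    # One-way ANOVA p-value plus eta^2 = SS_between / SS_total
    _, p_value = f_oneway(*groups)
    all_vals = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    grand_mean = all_vals.mean()
    ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = ((all_vals - grand_mean) ** 2).sum()
    return p_value, ss_between / ss_total

p, eta = anova_eta_squared([1, 2, 3], [2, 3, 4])
```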
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who either only heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or four negative audiofragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric kwallist test was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the God speed questionnaire. This scale is from 1 to 5.<br />
<br />
As a follow up test, the kwallis was executed another two times, but now one time with only participants that at least heard four positive audiofragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and the second time with only participants that at least heard four negative audiofragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audiofragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition ----------------------------------------------------------------------------------------- <br />
Figure 3: Likeability per condition when heard audio-fragements were mostly positive<br />
<br />
<br />
Finally, a kwallis test was done to test whether there is a difference in animacy between both conditions. A kwallis test was chosen because the assumptions were not met (rejection of equal variance for both groups). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4, represents the scale encoded from 1 to 5, retrieved from the God speed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p= 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. In addition, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small to medium effect on likeability was found between the conditions when only participants were included that heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses formulated in the introduction. This can be explained by several things. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: more information about the water consumption should be available and convincing arguments should be given. Leaving out this information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice used human-like and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA, obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes for both these groups were bigger than the effect size for animacy in general. An explanation for this could be that people who heard the same emotion several times became more accustomed to that voice. They might have perceived it as more lively because they did not hear any other voice with which they could compare it. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of 24 and 10 persons respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotions and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as was stated in the method, actors recorded entire sentences, and these form the basis of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice; in the end it was decided not to use that for this research. <br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustic and grammatical, they have a reinforcing effect. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone's own voice the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability of certain emotions will also increase when the acoustic features are implemented. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction: after analysing the results there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. In addition, speech rate and pitch variations within sentences and words were manipulated, but the documentation of the text-to-speech program of NAO says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all parameters used here can be changed on NAO. The aim of NAO can vary a lot. If it is known for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of the speech technology of NAO expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: an experimental one made by Philips, Google TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research and can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options apply only to a fragment as a whole; it is not possible to change these settings within a sentence or even a word. For Amigo the same conclusion can be drawn as for NAO: not all parameters that were changed in this research can be changed for the voice of Amigo. So again, if the aim of the robot is known, predefined sentences can be programmed and adjusted to create the right emotion, but this emotion will, just as for the NAO robot, not be expressed as well as in the performed research. Further improvements of the speech synthesizer of Amigo are needed to make it truly useful for producing emotionally loaded sentences. <br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots, but further improvements are needed to express the emotions more powerfully. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, such as 'Wow' or sobbing sounds. This could enhance the findings of this research, so this research can also serve as the basis for further work. In addition, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, as the combination of multiple features leads to correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
<br />
Northern Arizona University. EPS 625 – Intermediate Statistics. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia University.<br />
<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US</div>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by the constitution and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that spread all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like a prisoner in their own body. Their freedom decreases in multiple ways. For example, at some point they may not be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, they feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences without acoustic emotion, but which express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application: it can be used whenever no anatomical features of emotions are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that want to convince people to behave in a certain way, and some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the wanted change of behavior.<br />
<br />
== Method ==<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used, because this one is English (US) and has the different presets 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are then applied to the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009), which is very useful for different emotions. If sadness occurs, a sentence is spoken more slowly than if happiness occurs, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, it was chosen to use this value of 4.15 syllables per second for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007), so the Acapela Box was used to change this voice shape.<br />
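A back-of-the-envelope check on the speech rates quoted above: the tempo change needed to turn happy-rate speech into sad-rate speech follows directly from the two syllable rates (this calculation is an illustration, not part of the original procedure):

```python
# Tempo-change factor between the quoted syllable rates.
HAPPY_RATE = 4.15  # syllables per second (Breazeal, 2001)
SAD_RATE = 1.91    # syllables per second (Williams & Stevens, 1972)

speed_factor = SAD_RATE / HAPPY_RATE    # playback speed relative to happy
percent_change = (speed_factor - 1) * 100  # as a 'change tempo' percentage

print(f"{speed_factor:.2f}x speed, i.e. {percent_change:.0f}% tempo change")
```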
<br />
Audacity is free, open source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which an audio fragment can be given an emotional tone. With the function 'amplify' a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto, and the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives the voice a variety of different pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect 'change pitch' and filling in the desired percentage. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function 'change tempo' in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function 'change tempo' the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
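The 'amplify' step above applies a simple linear gain. As an illustration, the 7 dB difference can be reproduced on raw samples (a sketch with made-up audio; the `amplify` helper and the test tone are hypothetical, but the dB-to-gain conversion is the standard one Audacity also uses):

```python
import numpy as np

def amplify(samples: np.ndarray, db: float) -> np.ndarray:
    """Scale audio samples by a gain given in decibels."""
    gain = 10.0 ** (db / 20.0)                 # dB -> linear amplitude factor
    return np.clip(samples * gain, -1.0, 1.0)  # keep samples within full scale

# Hypothetical mono fragment (440 Hz tone) with peak amplitude 0.2
t = np.linspace(0.0, 1.0, 44100)
sad = 0.2 * np.sin(2 * np.pi * 440 * t)

happy = amplify(sad, 7.0)  # +7 dB louder, per Bowles & Pauletto (2010)
print(round(float(happy.max() / sad.max()), 2))  # 2.24, i.e. 10^(7/20)
```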
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms a new survey can be created together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to an .xlsx document (for the program Microsoft Office Excel) (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program for spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata was used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test was suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test would not be needed. <br />
However, it is presumable that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
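These sample sizes can be checked with a standard power calculation. A sketch with statsmodels (the text does not say which tool performed the original analysis), taking Cohen's d = 0.5 as the moderate and d = 0.2 as the small effect size:

```python
import math
from statsmodels.stats.power import TTestIndPower

power_analysis = TTestIndPower()

def n_per_group(effect_size: float, alternative: str) -> int:
    """Participants needed per condition for power 0.8 at alpha 0.05."""
    n = power_analysis.solve_power(effect_size=effect_size, power=0.8,
                                   alpha=0.05, alternative=alternative)
    return math.ceil(n)

print(n_per_group(0.5, 'two-sided'), n_per_group(0.5, 'larger'))  # 64 51
print(n_per_group(0.2, 'two-sided'), n_per_group(0.2, 'larger'))  # roughly 394 and 310
```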
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which therefore is a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5, and all are interval variables. The dependent variables were composed of several questions in the questionnaire: for likeability the questions 27, 30, 35, 38 and 40 were used; questions 28, 31, 32 and 34 were used for animacy; and persuasiveness was composed of the questions 29, 33, 36 and 37.<br />
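Composing a scale score from its items amounts to a per-participant mean over the relevant questionnaire columns. A sketch with pandas (column names and answers are hypothetical, not the collected data):

```python
import pandas as pd

# Hypothetical answers of three participants, 1-5 Likert encoding
df = pd.DataFrame({
    "q27": [4, 3, 5], "q30": [4, 2, 4], "q35": [3, 3, 5],
    "q38": [4, 2, 4], "q40": [5, 3, 4],
})

likeability_items = ["q27", "q30", "q35", "q38", "q40"]  # items named in the text
df["likeability"] = df[likeability_items].mean(axis=1)   # per-participant scale score

print(df["likeability"].tolist())  # [4.0, 2.6, 4.4]
```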
<br />
Because the survey is about water consumption and persuadability, two covariates were constructed that could influence the way participants responded to the comments of Will: how easily a participant is convinced, and how much a participant cares about the environment. Both covariates are composed of several questions in the questionnaire. How easily one is convinced is composed of the questions 5, 8, 13 and 18; how much one cares about the environment is composed of the questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one, in which some demographic information was asked, along with questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly guessing the research goal. The second part consisted of a simulation in which everyone filled in some questions about their showering habits; after each answer, an audio fragment was heard which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from our lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. In addition, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed from the data set because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. After this, 94 participants were left.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha; these values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
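Cronbach's alpha has a short closed form. A sketch of how such reliability values can be computed (the response matrix below is invented for illustration, so it will not reproduce the reported 0.85):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: rows = participants, columns = scale items (Likert answers)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_variance = items.sum(axis=1).var(ddof=1)    # variance of the sum score
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical 5-item responses for six participants
answers = np.array([[4, 4, 3, 4, 5],
                    [3, 2, 3, 2, 3],
                    [5, 4, 5, 4, 4],
                    [2, 2, 1, 2, 2],
                    [4, 5, 4, 4, 5],
                    [3, 3, 3, 2, 3]])
print(round(cronbach_alpha(answers), 2))
```

For perfectly correlated items the formula yields exactly 1, which is a quick sanity check on an implementation like this.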
<br />
To test whether the persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), with zero meaning neutral. <br />
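With two groups, a one-way ANOVA reduces to comparing the two condition means. A sketch of the test plus the corresponding effect size (scores are invented, encoded -2 to +2 as above):

```python
import numpy as np
from scipy.stats import f_oneway

neutral = np.array([0, -1, 1, 0, 0, 1, -1, 1])   # hypothetical persuasion scores
emotion = np.array([0, 1, -1, 0, 1, 0, 0, -1])

f_stat, p_value = f_oneway(neutral, emotion)

# eta^2 = between-group sum of squares / total sum of squares
scores = np.concatenate([neutral, emotion])
ss_between = sum(len(g) * (g.mean() - scores.mean()) ** 2 for g in (neutral, emotion))
ss_total = ((scores - scores.mean()) ** 2).sum()
eta_squared = ss_between / ss_total

print(f"F = {f_stat:.2f}, p = {p_value:.2f}, eta^2 = {eta_squared:.4f}")
```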
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric kwallist test was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the God speed questionnaire. This scale is from 1 to 5.<br />
<br />
As a follow up test, the kwallis was executed another two times, but now one time with only participants that at least heard four positive audiofragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and the second time with only participants that at least heard four negative audiofragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audiofragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition ----------------------------------------------------------------------------------------- <br />
Figure 3: Likeability per condition when heard audio-fragements were mostly positive<br />
<br />
<br />
Finally, a kwallis test was done to test whether there is a difference in animacy between both conditions. A kwallis test was chosen because the assumptions were not met (rejection of equal variance for both groups). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4, represents the scale encoded from 1 to 5, retrieved from the God speed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p= 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results the concept persuasion did not result in a significant difference between the conditions. This can be seen by looking at the high p-value that was found. Besides that the effect size was taken into consideration and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant, and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small to medium effect was found in likeability between the conditions when only participants were included that at least heard four out of five positive audiofragments. This makes sense, since likeability is in itself a positive concept. So the more happy a fragment sounds (happy versus neutral), the more likeable it is perceived. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained in several ways. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there should be more information available about water consumption, and convincing arguments should be given. Leaving out this information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may affect the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice that was used human-like, and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: for the condition with emotion the perceived animacy was higher than for the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. Effect sizes were bigger for both these groups than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice. They might therefore have perceived it as more lively, because there was no other voice to compare it with. However, these findings were not significant, and the question is how reliable they are, because the two groups consisted of only 24 and 10 persons, respectively. <br />
<br />
Other issues that can be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotion and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as stated in the method, actors recorded entire sentences, and these form the basis for the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction: when multiple characteristics of emotion are combined, e.g. acoustic and grammatical, they have a reinforcing effect. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them, and the freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone's own voice, the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results there was no significant difference between using acoustic features of emotion or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the explanation of the text-to-speech program of NAO says nothing about speed changes, and pitch variations within sentences and words are not mentioned as an option either. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all the parameters that were used can be changed on NAO. The aim of NAO can vary a lot: if the goal it is used for is known, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. 
If future improvements of the speech technology of NAO expand the set of changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: an experimental one made by Philips, Google TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research and can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options apply only to a fragment as a whole; it is not possible to change these settings within a sentence or even a word. For Amigo the same conclusion can be drawn as for NAO: not all parameters that were changed in this research can be changed for the voice of Amigo. So again, if the aim of the robot is known, predefined sentences can be programmed and adjusted to create the right emotion. However, this emotion will, just as for the NAO robot, not be expressed as well as in the performed research. Further improvements to the speech synthesizer of Amigo are needed to make Amigo truly useful for producing emotionally loaded sentences. <br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots, but further improvements are needed to express the emotions more powerfully. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, such as 'Wow' or sobbing sounds. This could enhance the findings of this research, so this research can also serve as a basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, as it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. Retrieved from Aldebaran: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. Retrieved from eSpeak: http://espeak.sourceforge.net/<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Northern Arizona University. EPS 625 – Intermediate statistics: The Kruskal-Wallis test. Retrieved from EPS 625: http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. Retrieved from Oddcast: http://www.oddcast.com/demos/tts/emotion.html<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. (2013). Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates . Cambridge, Massachusets: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.</div>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
Perhaps also interesting for the presentation:<br />
http://www.euanmacdonaldcentre.com/about-the-centre/euans-story/<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that went all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (Euan MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the organisation feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the acoustic features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions, like sadness. Other studies came up with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences without acoustic emotion, but which express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotion? This question leads to the research question of this research:<br />
<br />
What is the effect on the perception of humans of adding acoustic features of emotion to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application. It can be used when no anatomical features of emotion are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, which leads to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of these programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used, because this one is English (US) and has the different functions 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are used and implemented in the written text. While programming this, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009), which can be very useful for different emotions. In the case of sadness, a sentence is spoken more slowly than in the case of happiness, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, it was chosen to use this value of 4.15 syllables per second for happiness as well (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007). So the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity has many functions with which you can give an audio fragment an emotional tone. With the function 'amplify' you can choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average, and the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect 'adjust pitch' and filling in the percentage by which you want to change the pitch. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function 'change tempo' in Audacity, the length of this break can be made longer. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and when a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than with a neutral voice (Bowles & Pauletto, 2010). With the function 'change tempo' the speed of the voice changes, but the pitch does not, which is exactly what is needed for these emotions.<br />
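The parameter values cited in the two paragraphs above can be collected in one lookup table. The following is a minimal sketch, not part of the original study's tooling: the names (EMOTION_PARAMS, db_to_gain, word_tempo_factor) are illustrative, and the numbers are the ones quoted from Williams & Stevens (1972), Breazeal (2001), and Bowles & Pauletto (2010).

```python
# Illustrative lookup table for the cited acoustic parameters.
EMOTION_PARAMS = {
    # speech_rate in syllables/second; loudness in dB relative to neutral
    "happy":   {"speech_rate": 4.15, "loudness_db": 7.0, "pitch": "high, varied"},
    "sad":     {"speech_rate": 1.91, "loudness_db": 0.0, "pitch": "low, falling"},
    "neutral": {"speech_rate": None, "loudness_db": 0.0, "pitch": "baseline"},
}

def db_to_gain(db: float) -> float:
    """Convert a dB offset to the linear amplitude factor an editor
    such as Audacity applies with an 'amplify' effect."""
    return 10 ** (db / 20)

def word_tempo_factor(emotion: str, long_word: bool) -> float:
    """Per-word slowdown for sadness: long words ~20% slower, short
    words 10-20% slower (the midpoint, 15%, is assumed here)."""
    if emotion == "sad":
        return 1.20 if long_word else 1.15
    return 1.0
```

For example, the 7 dB difference between happy and sad corresponds to an amplitude factor of roughly 2.24, which is what the 'amplify' step effectively applies.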
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. This spreadsheet can be imported in the program Stata and from there on it is considered as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test would not be needed. <br />
However, it is likely that the effect size is small instead of moderate. Another power analysis was executed to see how many participants would be needed if the effect size is small: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
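The sample sizes above can be reproduced with a standard two-sample power calculation. The sketch below uses the normal approximation (stdlib only; the exact t-based calculation, e.g. in G*Power or statsmodels, adds roughly one participant per group, giving the 64/51/394 figures quoted in the text).

```python
# Two-sample power analysis via the normal approximation:
# n per group = 2 * ((z_alpha + z_beta) / d)^2, with d = Cohen's d.
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size: float, power: float = 0.8,
                alpha: float = 0.05, two_tailed: bool = True) -> int:
    """Participants needed per condition to detect a standardized
    mean difference of `effect_size` between two independent groups."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Medium effect (d = 0.5): 63 two-tailed, 50 one-tailed (vs. 64/51 with the t-test)
# Small effect  (d = 0.2): 393 two-tailed, 310 one-tailed
```

This makes explicit why the study, with about 50 participants per condition, was powered only for a medium effect.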
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which therefore is a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables were measured on a Likert scale varying from 1 to 5, so all the dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy, and the dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were made which could influence the way participants responded to the comments of Will. These covariates are how easy you are to convince and how much you care about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of questions 5, 8, 13 and 18, and how much you care about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
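The item-to-concept mapping described above can be sketched as a simple averaging step. This is an illustration only: the question numbering follows the text, but the data layout (one dict of question number to answer per participant) and the names SCALE_ITEMS and concept_score are assumptions, not the study's actual analysis code.

```python
# Compose concept scores by averaging each participant's answers
# to the questionnaire items belonging to that concept.
from statistics import mean

SCALE_ITEMS = {
    "likeability":       [27, 30, 35, 38, 40],
    "animacy":           [28, 31, 32, 34],
    "persuasiveness":    [29, 33, 36, 37],
    "easily_convinced":  [5, 8, 13, 18],
    "cares_environment": [4, 6, 10, 11, 12, 15, 17, 19],
}

def concept_score(answers: dict, concept: str) -> float:
    """Average the Likert answers belonging to one concept."""
    return mean(answers[q] for q in SCALE_ITEMS[concept])
```

A participant answering 4, 5, 4, 3, 4 on the five likeability items would, for instance, get a likeability score of 4.0.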
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire, which contained three parts.<br />
The first part is the general one, in which some demographic information was asked. In addition, some questions were asked about two personal characteristics, and some questions were included to prevent participants from directly knowing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was heard which gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed, since they submitted the questionnaire twice. One extra person was removed from the data set, since this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants were removed who were totally distracted while filling in the questionnaire. After this we were left with 94 participants.<br />
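The exclusion steps above can be expressed as a single filtering pass. This is a sketch under stated assumptions: the record keys ("id", "english", "distracted") are hypothetical, and keeping the first of a duplicate submission is one possible reading of how duplicates were handled.

```python
# Apply the exclusion criteria described in the text to a list of
# responses, each assumed to be a dict with hypothetical keys.
def clean(responses: list) -> list:
    seen, kept = set(), []
    for r in responses:
        if r["id"] in seen:                      # duplicate submission
            continue
        seen.add(r["id"])
        if r["english"] == "totally disagree":   # insufficient English
            continue
        if r["distracted"]:                      # totally distracted
            continue
        kept.append(r)
    return kept
```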
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education differed among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, and the second is for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; these values were 0.85, 0.89, and 0.85, respectively. The concepts will be discussed in the given order. <br />
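The reliability check mentioned above is a standard computation. A minimal sketch of Cronbach's alpha, assuming the item scores are laid out as rows of participants and columns of items (this is illustrative code, not the Stata commands the study actually used):

```python
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).
from statistics import variance

def cronbach_alpha(scores: list) -> float:
    """scores: list of rows, one per participant; each row lists that
    participant's answers to the k items of one scale."""
    k = len(scores[0])                 # number of items in the scale
    items = list(zip(*scores))         # transpose: one column per item
    item_vars = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_vars / total_var)
```

When every item gives identical answers for each participant, alpha reaches its maximum of 1.0; values such as the 0.85-0.89 reported above indicate that the items of each scale covary strongly.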
<br />
To test whether persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), with zero meaning neutral. <br />
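The effect size reported with the ANOVA above is eta-squared, the ratio of between-group to total variance. A minimal sketch of that computation (illustrative only; the study used Stata):

```python
# One-way ANOVA effect size: eta^2 = SS_between / SS_total.
from statistics import mean

def anova_eta_squared(groups: list) -> float:
    """groups: list of lists, one per condition, of individual scores."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_total = sum((x - grand) ** 2 for x in all_values)
    return ss_between / ss_total
```

A value near zero, as found here (0.00003), means the condition explains essentially none of the variance in persuasion scores.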
<br />
After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (NAU EPS625). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
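The non-parametric test and effect size used above can be sketched as follows. This illustration computes the Kruskal-Wallis H statistic (without tie correction, for brevity, so it assumes no tied scores) and then applies the effect-size formula from the text, with H playing the role of chi-squared; it is not the Stata kwallis command the study ran.

```python
# Kruskal-Wallis H on pooled ranks, plus eta^2 = chi^2 / (N - 1).
def kruskal_wallis_h(groups: list) -> float:
    """H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), where R_i is the
    rank sum of group i over the pooled sample. Assumes no ties."""
    pooled = sorted(x for g in groups for x in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n = len(pooled)
    return 12 / (n * (n + 1)) * sum(
        sum(rank[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

def eta_squared_from_h(h: float, n: int) -> float:
    """Effect size following the NAU EPS 625 handout: chi^2 / (N - 1)."""
    return h / (n - 1)
```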
<br />
As a follow-up test, the Kruskal-Wallis test was executed another two times: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same scale on the y-axis as figure 2. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition ----------------------------------------------------------------------------------------- <br />
Figure 3: Likeability per condition when heard audio-fragements were mostly positive<br />
<br />
<br />
Finally, a kwallis test was done to test whether there is a difference in animacy between both conditions. A kwallis test was chosen because the assumptions were not met (rejection of equal variance for both groups). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. The y-axis of figure 4, represents the scale encoded from 1 to 5, retrieved from the God speed questionnaire.<br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p= 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results the concept persuasion did not result in a significant difference between the conditions. This can be seen by looking at the high p-value that was found. Besides that the effect size was taken into consideration and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant, and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small to medium effect was found in likeability between the conditions when only participants were included that at least heard four out of five positive audiofragments. This makes sense, since likeability is in itself a positive concept. So the more happy a fragment sounds (happy versus neutral), the more likeable it is perceived. <br />
<br />
Both findings for persuasion and likeability go against the hypotheses that were formulated in the introduction. This can be explained by several things. At first for persuasion it yields that emotion might not be enough to persuade people into changing their behavior. Some participants gave the feedback that also the content of the sentence is of importance: there must be more information available about the water consumption and constructing arguments should be given. Leaving out information was done deliberately to only focus on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may affect the results. A lot of participants commented that the voice sounded too fake or robotic. It was found that the more human-like a robot is, the more accepted and likeable a robot is (Royakkers et al., 2012). Some participants did not find the used robotic-voice human-like and therefore probably did not find it very likable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: in the condition with emotion the perceived animacy was higher than in the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). The finding for animacy was in line with the hypothesis: an emotional voice is perceived as more lively than a neutral voice. Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes were bigger for both these groups than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice. They might therefore have perceived it as more lively, because there was no other voice with which they could compare it. However, these findings were not significant, and it is questionable how reliable they are, because the two groups consisted of only 24 and 10 persons, respectively. <br />
<br />
Other issues that could be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so more participants would have been needed to enhance reliability. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotions and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as stated in the method, actors recorded entire sentences, and these form the base of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with and without acoustic features of emotion, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previous research stated in the introduction: when multiple characteristics of emotions are combined, e.g. acoustic and grammatical, this has a reinforcing effect. <br />
<br />
Now let us look back at the problem that was given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but in many cases this becomes impossible as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, its implementation can be improved. By using someone’s own sound, the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability will also increase when the acoustic features of certain emotions are implemented. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as described in the introduction. After analysing the results, there was no significant difference between using acoustic features of emotions or not, and there was no large effect either. However, the concept of persuasiveness was taken into account for a broader application of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other applications. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. This research used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013); these are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the documentation of NAO's text-to-speech program says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot; if it is known for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. If future improvements of NAO's speech technology expand the changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second social robot mentioned was Amigo. Amigo uses three text-to-speech programs: an experimental one made by Philips, Google TTS, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in this research and can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options only apply to an entire fragment; it is not possible to change these settings within a sentence or even a word. For Amigo the same conclusion can be drawn as for NAO: not all parameters that were changed in this research can be changed for the voice of Amigo. So again, if the aim of the robot is known, predefined sentences can be programmed and adjusted to create the right emotion. However, this emotion will, just as for the NAO robot, not be expressed as well as in the performed research. Further improvements to the speech synthesizer of Amigo are needed to make it truly useful for producing emotionally loaded sentences. <br />
Overall, the findings of this research are useful for increasing the perceived animacy and likeability of robots, but further improvements are needed to express the emotions more powerfully. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds, such as ‘Wow’ or sobbing sounds, to a voice recording. This could enhance the findings of this research, so this research can serve as a basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions: although the findings of previous research on this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, as the combination of multiple features leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. [ONLINE] Available at: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html. [Last Accessed 16-10-2014]<br />
<br />
Audacity. (2014, september 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM.<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. [ONLINE] Available at: http://espeak.sourceforge.net/. [Last Accessed 16-10-2014].<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. [ONLINE] Available at: http://www.oddcast.com/demos/tts/emotion.html. [Last Accessed 16-10-2014].<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? Disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US.<br />
</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15698Samenvatting2014-10-16T15:29:20Z<p>S126005: /* Resultaten */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
Perhaps still interesting for the presentation:<br />
http://www.euanmacdonaldcentre.com/about-the-centre/euans-story/<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that went all over Facebook. The idea of that campaign was to raise awareness of the disease, so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like a prisoner in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients before the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to recognize certain emotions such as sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized from acoustic features alone. Busso et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is only based on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences that are spoken without emotion, but that express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this research:<br />
<br />
What is the effect on the perception of humans by adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application. It can be used whenever no anatomical features of emotions are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way, and some of these use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the persuasiveness of the device, leading to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator that can create voice messages from written text with predefined voices. The voice of Will was used, because it is English (US) and has the different modes ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are then applied to the written text. In this process attention is paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also offers the possibility to change the speech rate and the voice shaping (Acapela Group, 2009), which is very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, this value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007). The Acapela Box was therefore used to change this voice shape as well.<br />
<br />
Audacity is free, open-source, cross-platform software for recording and editing audio (Audacity, 2014). Audacity offers many functions with which an audio fragment can be given an emotional tone. With the effect ‘Amplify’ a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average, and the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half (Williams & Stevens, 1972); a sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, a happy voice shows a variety of different pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect ‘Change Pitch’ and filling in the desired percentage. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the effect ‘Change Tempo’, the length of this break can be increased. This effect can also be used to lengthen or shorten the words of a sentence individually, which is useful because words with only one syllable are pronounced faster when happy, and when a person is sad, long words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With ‘Change Tempo’ the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
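The acoustic settings described above can be summarized in a small lookup table. The sketch below (Python; the values come from the cited literature as quoted in this section, but the data structure, the `tempo_factor` helper and the baseline speech rate of 3.0 syllables per second are our own illustrative assumptions, not part of the actual editing procedure):<br />

```python
# Acoustic-feature settings per emotion, as described in the text above.
# Numeric values: Williams & Stevens (1972), Bowles & Pauletto (2010),
# Breazeal (2001). The dict layout and baseline are illustrative only.
EMOTION_PARAMS = {
    "happy":   {"syllables_per_sec": 4.15, "amplitude_db_offset": 7.0,
                "pitch": "high average, large variation"},
    "sad":     {"syllables_per_sec": 1.91, "amplitude_db_offset": 0.0,
                "pitch": "low average, falling at sentence end",
                "long_word_slowdown": 0.20, "short_word_slowdown": (0.10, 0.20)},
    "neutral": {"syllables_per_sec": None, "amplitude_db_offset": 0.0,
                "pitch": "baseline"},
}

def tempo_factor(emotion: str, baseline_sps: float = 3.0) -> float:
    """Relative 'Change Tempo' factor versus an assumed baseline speech rate."""
    sps = EMOTION_PARAMS[emotion]["syllables_per_sec"]
    return 1.0 if sps is None else sps / baseline_sps
```

With this table, `tempo_factor("sad")` is below 1 (slower speech) and `tempo_factor("happy")` is above 1, mirroring the manual Audacity adjustments.<br />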
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms a new survey can be created together with others at the same time; it is a tool for collecting information. Audio fragments (via video) can be inserted and any desired question can be written. After enough data from the participants has been received, the information can be collected in a spreadsheet, which can be exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
Microsoft Office Excel is a program for working with spreadsheets. The spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to estimate how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a medium effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for this experiment, because either no relation or a positive relation was expected; a negative relation was not. There is therefore only one direction in which an effect could occur, and a two-tailed t-test is not needed. <br />
However, it is plausible that the effect size is small instead of medium. Another power analysis was executed to see how many participants would be needed in that case: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
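The sample sizes above can be reproduced approximately. The sketch below (Python with SciPy; not the tool actually used for the study's power analysis) applies the standard normal-approximation formula n = 2(z<sub>α</sub> + z<sub>β</sub>)²/d² per group. Note that the exact noncentral-t computation gives answers one or two participants higher, which matches the 64/51/394 reported in the text:<br />

```python
from math import ceil
from scipy.stats import norm

def n_per_group(d: float, power: float = 0.8, alpha: float = 0.05,
                two_tailed: bool = True) -> int:
    """Approximate sample size per group for a two-sample t-test.

    Normal approximation: n = 2 * (z_alpha + z_beta)^2 / d^2.
    The exact noncentral-t answer is typically 1-2 participants larger.
    """
    z_alpha = norm.ppf(1 - alpha / 2) if two_tailed else norm.ppf(1 - alpha)
    z_beta = norm.ppf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)
```

For example, `n_per_group(0.5)` gives 63 (exact: 64), `n_per_group(0.5, two_tailed=False)` gives 50 (exact: 51), and `n_per_group(0.2, two_tailed=False)` gives 310, matching the small-effect one-tailed figure in the text.<br />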
<br />
For the experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group heard only one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness, each measured on a Likert scale from 1 to 5; all dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire: for likeability questions 27, 30, 35, 38 and 40 were used; for animacy questions 28, 31, 32 and 34; and for persuasiveness questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were created that could influence the way participants responded to the comments of Will. These covariates are how easily one is convinced and how much one cares about the environment. Both covariates are composed of several questions in the questionnaire: how easily one is convinced is composed of questions 5, 8, 13 and 18, and how much one cares about the environment of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
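The mapping from questionnaire items to scales can be written down compactly. The sketch below (Python) uses the question numbers stated above; the averaging in `scale_score` is our own assumption, since the text says the variables are "composed of" these questions without specifying the aggregation:<br />

```python
# Questionnaire items per scale, as listed in the Design section.
SCALES = {
    "likeability":       [27, 30, 35, 38, 40],
    "animacy":           [28, 31, 32, 34],
    "persuasiveness":    [29, 33, 36, 37],
    # Covariates:
    "easily_convinced":  [5, 8, 13, 18],
    "cares_environment": [4, 6, 10, 11, 12, 15, 17, 19],
}

def scale_score(answers: dict, scale: str) -> float:
    """Mean of the 1-5 Likert answers belonging to one scale.

    `answers` maps question number -> Likert answer (1..5).
    Averaging is an illustrative assumption, not stated in the text.
    """
    qs = SCALES[scale]
    return sum(answers[q] for q in qs) / len(qs)
```

A participant who answers 3 ("neutral") on every item would then score 3.0 on each scale.<br />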
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts.<br />
The first part was a general one in which some demographic information was asked. In addition, some questions were asked about two personal characteristics, and some filler questions were included to prevent participants from directly guessing the research goal. The second part consisted of a simulation in which everyone filled in some questions about their showering habits; after each answer, an audio fragment was heard that gave either positive or negative feedback. The third part contained questions about how the heard voice was experienced. Finally, some questions were added to give an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friend lists of our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits due to living in a care home, from filling in the questionnaire. Furthermore, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. This left 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used in this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; these values were 0.85, 0.89, and 0.85, respectively. The concepts will be discussed in the given order. <br />
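Cronbach's alpha, used above as the reliability measure, can be computed directly from the item scores. A minimal sketch in Python with NumPy (illustrative; the study itself used Stata):<br />

```python
import numpy as np

def cronbach_alpha(items) -> float:
    """Cronbach's alpha for an (n_participants x n_items) score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)
```

As a sanity check, a scale whose items are perfectly parallel (every participant gives the same answer on both items) yields alpha = 1.0.<br />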
<br />
To test whether persuasion of the voice was perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
<br />
After this, a second test was done including only participants who said they were willing to adjust their showering habits. In this case the p-value was 0.95 and <math>\eta ^2</math> = 0.00005. <br />
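The ANOVA p-value and <math>\eta ^2</math> reported above can be reproduced as follows. The sketch below (Python with SciPy, on made-up illustrative data, not the study's; the study used Stata) computes <math>\eta ^2</math> as the between-groups sum of squares divided by the total sum of squares:<br />

```python
import numpy as np
from scipy.stats import f_oneway

def anova_eta_squared(*groups):
    """One-way ANOVA p-value plus eta^2 = SS_between / SS_total."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    allvals = np.concatenate(groups)
    grand = allvals.mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_total = ((allvals - grand) ** 2).sum()
    _, p = f_oneway(*groups)
    return p, ss_between / ss_total
```

When the two conditions have identical scores, eta^2 is exactly 0, matching the near-zero effect sizes found for persuasion.<br />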
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether participants in the condition with emotion rated the voice as more likeable than participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (Stata's kwallis) was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect size was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf). Figure 2 shows these results. The y-axis represents the scale retrieved from the Godspeed questionnaire, which runs from 1 to 5.<br />
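The effect-size computation above can be sketched in a few lines. The snippet below (Python with SciPy; illustrative, not the study's Stata code) runs a Kruskal-Wallis test and applies the formula <math> \eta ^2=\frac{\chi ^2}{N-1}</math>, using the H statistic as the chi-squared value:<br />

```python
from scipy.stats import kruskal

def kruskal_eta_squared(*groups):
    """Kruskal-Wallis p-value and the eta^2 approximation H / (N - 1)."""
    h, p = kruskal(*groups)          # H is chi-squared distributed
    n = sum(len(g) for g in groups)
    return p, h / (n - 1)
```

For two clearly separated groups of five values each, H = 6.82, giving eta^2 = 6.82 / 9 ≈ 0.76 and p < 0.05.<br />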
<br />
As a follow-up, the Kruskal-Wallis test was executed another two times: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3, which has the same y-axis scale as figure 2. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions; this test was chosen because the assumptions were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4, whose y-axis represents the scale from 1 to 5 retrieved from the Godspeed questionnaire.<br />
<br />
To test whether the type of emotion heard influenced perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
Looking at the results, the concept persuasion did not show a significant difference between the conditions, as can be seen from the high p-value that was found. Besides that, the effect size was taken into consideration, and this was close to zero. Even after controlling for willingness to adjust showering habits, the effect was nonsignificant, and the effect size hardly increased.<br />
<br />
Likeability also showed a nonsignificant effect, but in contrast to persuasion a small effect size was found. However, a significant small-to-medium effect on likeability was found between the conditions when only participants were included who heard at least four out of five positive audio fragments. This makes sense, since likeability is in itself a positive concept: the happier a fragment sounds (happy versus neutral), the more likeable it is perceived to be. <br />
<br />
Both findings, for persuasion and likeability, go against the hypotheses that were formulated in the introduction. This can be explained in several ways. First, for persuasion, emotion alone might not be enough to persuade people into changing their behavior. Some participants gave the feedback that the content of the sentence is also important: there should be more information about water consumption, and constructive arguments should be given. Leaving out this information was done deliberately, to focus only on the emotional context of the sentence instead of the informational context. <br />
<br />
For likeability another problem may have affected the results. Many participants commented that the voice sounded too fake or robotic. It has been found that the more human-like a robot is, the more accepted and likeable it is (Royakkers et al., 2012). Some participants did not find the robotic voice that was used human-like, and therefore probably did not find it very likeable.<br />
<br />
Now that persuasiveness and likeability have been discussed, only animacy is left. For animacy a significant effect was found between the two conditions: in the condition with emotion the perceived animacy was higher than in the condition without emotion. The effect size indicated that the difference between the two conditions is a medium effect (following the guidelines for a one-way ANOVA obtained from the Cognition and Brain Science Unit). Animacy was also tested for people who heard at least four positive audio fragments and for people who heard at least four negative fragments. The effect sizes were bigger for both these groups than the effect size for animacy in general. An explanation could be that people who heard the same emotion several times became more accustomed to that voice. They might therefore have perceived it as more lively, because there was no other voice with which they could compare it. However, these findings were not significant, and it is questionable how reliable they are, because the two groups consisted of only 24 and 10 persons, respectively. <br />
<br />
The finding for animacy was in line with the hypothesis. An emotional voice is perceived more lively than a neutral voice. <br />
<br />
Other issues that can be improved concern the design of the questionnaire. These limitations influence all three concepts. To begin with, given the time available for this research, concessions had to be made about the size of the sample groups. According to the prior power analysis, the number of participants was only enough to reliably find a medium effect, so to enhance reliability more participants would have been needed. A second problem might be the kind of speech program that was used. The difference between a sentence with acoustic features of emotion and one without was not easy to hear, which again decreases the chance of finding an effect. The reason for this is the way the program created the voices: as stated in the method, actors recorded entire sentences, and these recordings form the basis of the program. But is it possible for a person to speak without any kind of emotion? To create a more obvious difference between a sentence with acoustic features of emotion and one without, a solution might be to use a mechanical voice. In the end it was decided not to use that for this research because <br />
<br />
The outcome of this research is in accordance with the previously conducted research stated in the introduction. When multiple characteristics of emotions are combined, e.g. acoustic and grammatical, they have a reinforcing effect. This is also the outcome of previous research. <br />
<br />
Now let us look back at the problem given at the beginning of this research: the extremely limited freedom of patients who suffer from ALS. Like any human, they strive for independence, but this becomes impossible in many cases as the illness develops. Increasing their freedom in any way would be a gift to them. The freedom to express yourself is the main focus of this research. Although technology exists that allows people with ALS to use their own voice with speech technology, the implementation can be improved. By using someone’s own sound the level of animacy is improved. As this research shows, adding acoustic features of emotion to a voice will enhance the level of animacy even more. The likeability when using certain emotions will also be increased when implementing the acoustic features. Using these findings in speech technology will allow people with ALS to create a stronger emotional bond through speech with the people surrounding them. <br />
<br />
Meanwhile, this implementation might not be that useful for persuasive technology as stated in the introduction. After analysing the results there was no significant difference between using acoustic features of emotions or not, and there was also no large effect. However, the concept of persuasiveness was taken into account for a broader implementation of the findings of this research; this concept is not necessarily relevant to the issue of ALS. This does not mean that the findings of this research are not relevant for other implementations. When no anatomical features of emotion are available, the combination of acoustic and grammatical features has a positive effect on animacy and likeability. This can be used for social robots that cannot express themselves with mimicry, such as the NAO robot and the Amigo robot. The research that was performed used different parameters to change the voices according to a certain emotion. The NAO robot has the option to change the pitch and the volume of the voice (Aldebaran, 2013). These are both acoustic features of emotion that were also used to create the voices in the conducted research. Besides that, speech rate and pitch variations within sentences and words were manipulated, but the explanation of the text-to-speech program of NAO says nothing about speed changes, and pitch variations within sentences and words are also not mentioned as an option. This means that NAO can be used to create emotionally loaded sentences, but these sentences will not express the emotion as well as the sentences in this research did, because not all used parameters can be changed for NAO. The aim of NAO can vary a lot; if you know for which goal it is used, a set of predefined sentences can be programmed that are (partially) adjusted to the right emotion. 
If future improvements of the speech technology of NAO expand the set of changeable parameters, NAO can become even more appropriate for conveying emotionally loaded sentences. <br />
The second mentioned social robot was Amigo. Amigo uses three text-to-speech programs: an experimental one made by Philips, the TTS of Google, and Ubuntu eSpeak (Voncken, 2013). In eSpeak it is possible to adjust the voice by hand. Parameters that were used in the research and can also be changed in eSpeak are speech rate and volume (eSpeak, 2007). However, these options only apply to a fragment as a whole; it is not possible to change these settings within a sentence or even a word. For Amigo the same conclusion can be drawn as for NAO: not all parameters that were changed in this research can be changed for the voice of Amigo. So again, if the aim of the robot is known, predefined sentences can be programmed and adjusted to create the right emotion. However, this emotion will, just as for the NAO robot, not be expressed as well as in the performed research. Further improvements to the speech synthesizer of Amigo are needed to make Amigo truly useful for producing emotionally loaded sentences. <br />
Overall the findings of this research are useful to increase the perceived animacy and likeability of robots, but further improvements are needed to express the emotions more powerfully. <br />
<br />
During this research a program was found (Oddcast, 2012) that can add specific sounds to a voice recording, for example ‘Wow’ or sobbing sounds. This could enhance the findings of this research, so this research can also serve as a basis for further work. Besides that, more research needs to be done on the characteristics of acoustic features of emotions. Although the findings of previous research about this topic can be implemented, they can also be improved. An important remark is that it will always be difficult for people to recognise an emotion based on acoustic features alone, as it is the combination of multiple features that leads to the correct recognition of a certain emotion.<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Aldebaran Commodities B.V. (2013). ALTextToSpeech. [ONLINE] Available at: http://doc.aldebaran.com/1-14/naoqi/audio/altexttospeech.html. [Last Accessed 16-10-2014]<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
eSpeak (2007). eSpeak text to speech. [ONLINE] Available at: http://espeak.sourceforge.net/. [Last Accessed 16-10-2014].<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Oddcast (2012). Text-to-speech. [ONLINE] Available at: http://www.oddcast.com/demos/tts/emotion.html. [Last Accessed 16-10-2014].<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Voncken, J. M. R. (2013). Investigation of the user requirements and desires for a domestic service robot, compared to the AMIGO robot.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US<br />
</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15690Samenvatting2014-10-16T15:20:51Z<p>S126005: /* Resultaten */</p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
Perhaps also interesting for the presentation:<br />
http://www.euanmacdonaldcentre.com/about-the-centre/euans-story/<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that spread all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness people who suffer from ALS feel like prisoners in their own bodies. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the muscles that control the vocal cords can stop functioning. <br />
<br />
There is a team of researchers (Euan MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions (Williams & Stevens, 1972) (Breazeal, 2001) (Bowles & Pauletto, 2010). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came up with an explanation for this problem: they state that emotion cannot be recognized by acoustic features alone. Busso, et al. (2004) say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program it is not possible to include anatomical features. Nevertheless it is important to find the best possible way for ALS patients to express themselves verbally, even though it is based only on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences without acoustic emotion that nevertheless express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This question leads to the research question of this research:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge of prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness (Yoo, & Gretzel, 2011) (Vossen, Ham, & Midden, 2010). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has a broader application. It can be used when no anatomical features of emotions are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that aim to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could have a stronger effect on the convincingness of the device, leading to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages from your text with their voices. The voice of Will was used because it is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’ (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded; those parameters are used and implemented in the written text, with attention to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping (Acapela Group, 2009). This is very useful for different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972). Because the speech rates of happy and angry do not differ much, it was chosen to use this value of 4.15 syllables per second for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch (Liscombe, 2007), so the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function ‘amplify’ you may choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad fragments; the difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto, and the loudness of a neutral voice is close to that of the sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant in the second half of the sentence (Williams & Stevens, 1972). A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches (Breazeal, 2001). In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘change pitch’ and filling in the percentage you want to change the pitch by. Another feature is that a sad voice has longer breaks between two words compared to all other emotions (Bowles & Pauletto, 2010). By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be made longer. This function can also help you to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not, and this is exactly what is needed for these emotions.<br />
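The parameter values cited above can be summarized in a small sketch. The dictionary layout and the duration helper below are illustrative only — they are not the actual Acapela or Audacity settings used in the project:

```python
# Illustrative summary of two acoustic parameters cited above
# (Williams & Stevens, 1972; Bowles & Pauletto, 2010). The dictionary
# layout is hypothetical, not the project's real configuration.
EMOTION_PARAMS = {
    "happy": {"syllables_per_sec": 4.15, "amplitude_offset_db": 7.0},
    "sad":   {"syllables_per_sec": 1.91, "amplitude_offset_db": 0.0},
}

def estimated_duration_sec(n_syllables, emotion):
    """Rough spoken duration of a sentence, ignoring pauses."""
    return n_syllables / EMOTION_PARAMS[emotion]["syllables_per_sec"]

# The same 12-syllable sentence takes about 2.17 times as long when sad.
ratio = estimated_duration_sec(12, "sad") / estimated_duration_sec(12, "happy")
print(round(ratio, 2))  # 2.17
```

This makes concrete why the ‘change tempo’ function is needed: the cited speech rates alone more than double the duration of a sad sentence relative to a happy one.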
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. This spreadsheet can be imported in the program Stata and from there on it is considered as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a medium effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was suspected; a negative relation was not expected. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test is not needed. <br />
However, it is plausible that the effect size is small instead of medium. Another power analysis was executed to see how many participants would be needed if the effect size is small: 394 participants per condition for a two-tailed t-test and 310 participants per condition for a one-tailed t-test. Because the resources and time to collect so many participants were not available, the experiment was executed with 51 participants per condition. <br />
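The sample sizes above (64/51 per condition for a medium effect, 394/310 for a small effect) can be reproduced with a power calculation for an independent two-sample t-test. The sketch below assumes equal group sizes, Cohen's d of 0.5 (medium) and 0.2 (small), and uses scipy's noncentral t distribution; the report does not name the tool the authors actually used:

```python
# Power of an independent two-sample t-test (equal group sizes) via the
# noncentral t distribution, and the smallest n per group reaching a
# target power. A sketch; the report does not name the tool it used.
from scipy import stats

def power_two_sample_t(n_per_group, d, alpha=0.05, two_sided=True):
    df = 2 * n_per_group - 2
    nc = d * (n_per_group / 2) ** 0.5          # noncentrality parameter
    if two_sided:
        tcrit = stats.t.ppf(1 - alpha / 2, df)
        return (1 - stats.nct.cdf(tcrit, df, nc)) + stats.nct.cdf(-tcrit, df, nc)
    tcrit = stats.t.ppf(1 - alpha, df)
    return 1 - stats.nct.cdf(tcrit, df, nc)

def n_needed(d, power=0.8, alpha=0.05, two_sided=True):
    n = 2
    while power_two_sample_t(n, d, alpha, two_sided) < power:
        n += 1
    return n

print(n_needed(0.5, two_sided=True))    # 64, medium effect, two-tailed
print(n_needed(0.5, two_sided=False))   # 51, medium effect, one-tailed
print(n_needed(0.2, two_sided=False))   # 310, small effect, one-tailed
```

The computed values match the four sample sizes quoted in the text, which suggests the authors used the same standard two-sample t-test power model.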
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5, so all the dependent variables are interval variables. The dependent variables were composed of several questions in the questionnaire: for likeability, questions 27, 30, 35, 38 and 40 were used; questions 28, 31, 32 and 34 were used for animacy; and persuasiveness was composed of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were constructed that could influence the way participants responded to the comments of Will. These covariates are how easily you are convinced and how much you care about the environment. Both covariates are composed of several questions in the questionnaire: how easily you are convinced is composed of questions 5, 8, 13 and 18, and how much you care about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
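The scale construction described above can be sketched as averaging the relevant Likert items per participant. The question numbers follow the text; the answer values below are invented for illustration:

```python
# Compose a scale score as the mean of its Likert items (question
# numbers from the text above; the example answers are invented).
LIKEABILITY_ITEMS = ["q27", "q30", "q35", "q38", "q40"]
ANIMACY_ITEMS = ["q28", "q31", "q32", "q34"]
PERSUASIVENESS_ITEMS = ["q29", "q33", "q36", "q37"]

def scale_score(answers, items):
    """Mean of the selected Likert items (here encoded -2 .. +2)."""
    return sum(answers[q] for q in items) / len(items)

fake_answers = {"q27": 1, "q30": 0, "q35": 2, "q38": 1, "q40": 1}
print(scale_score(fake_answers, LIKEABILITY_ITEMS))   # 1.0
```

Averaging (rather than summing) keeps the composite on the same -2 to +2 scale as the individual items, which is the encoding used for the graphs in the results.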
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts. <br />
The first part is a general one in which some demographic information was asked, along with questions about two personal characteristics. Furthermore, some questions were included to prevent participants from directly guessing our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was played which gave either positive or negative feedback. The third part contained questions about how the voice was experienced. Finally, we added some questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friends lists of our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits because they live in a care home, from filling in the questionnaire. Besides, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One extra person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants who were totally distracted while filling in the questionnaire were removed. After this, 94 participants were left.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha; these values were respectively 0.85, 0.89, and 0.85. The concepts will be discussed in the given order. <br />
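Cronbach's alpha can be computed directly from the item responses. A minimal sketch, with an invented data matrix (rows are participants, columns are the items of one scale):

```python
# Cronbach's alpha: (k/(k-1)) * (1 - sum(item variances) / variance of
# the total score). Rows = participants, columns = items of one scale.
def cronbach_alpha(rows):
    k = len(rows[0])

    def var(xs):                        # sample variance (n - 1)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([r[j] for r in rows]) for j in range(k)]
    total_var = var([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Perfectly consistent items give alpha of 1 (invented data).
print(round(cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]), 6))
```

Values around 0.85 like those reported are conventionally regarded as good internal consistency for a questionnaire scale.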
<br />
To test whether the persuasion of the voice is perceived differently between both conditions, an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. The y-axis represents a 5-point Likert scale, encoded from -2 (totally disagree) to +2 (totally agree), where zero means neutral. <br />
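This test can be sketched as a one-way ANOVA plus <math>\eta ^2</math> computed as the between-groups sum of squares divided by the total sum of squares. The two score lists below are invented, not the study data:

```python
# One-way ANOVA between two conditions plus eta squared
# (SS_between / SS_total). The score lists are invented example data.
from scipy import stats

def eta_squared(groups):
    grand = [x for g in groups for x in g]
    gm = sum(grand) / len(grand)
    ss_between = sum(len(g) * ((sum(g) / len(g)) - gm) ** 2 for g in groups)
    ss_total = sum((x - gm) ** 2 for x in grand)
    return ss_between / ss_total

neutral = [0.2, -0.5, 1.0, 0.0, -1.2]       # invented per-participant scores
emotional = [0.4, -0.6, 0.8, 0.1, -1.0]
f_stat, p_value = stats.f_oneway(neutral, emotional)
```

With scores this similar between conditions, the p-value is large and <math>\eta ^2</math> is near zero, mirroring the persuasion result reported above.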
<br />
After this, a second test was done in which participants were only included if they said that they were willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether the participants in the condition with emotion rated the voice as more likeable than the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (kwallis in Stata) was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf). Figure 2 shows these results. <br />
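A sketch of this non-parametric test together with the effect size formula above, using scipy's Kruskal-Wallis implementation (the example data are invented):

```python
# Kruskal-Wallis H test with the effect size used above:
# eta^2 = chi^2 / (N - 1), where H approximates the chi^2 statistic.
from scipy import stats

def kruskal_eta_squared(*groups):
    h, p = stats.kruskal(*groups)
    n = sum(len(g) for g in groups)
    return h / (n - 1), p

# Two invented condition samples; heavily overlapping, so the effect is small.
eta2, p = kruskal_eta_squared([1, 2, 2, 3, 4], [2, 3, 3, 4, 5])
```

Because the Kruskal-Wallis H statistic is approximately chi-squared distributed, dividing it by N - 1 gives the same <math>\eta ^2</math>-style effect size that the linked handout describes.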
<br />
As a follow-up, the Kruskal-Wallis test was executed another two times: once with only participants that heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants that heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when the heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between both conditions. This test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The found difference can be seen in figure 4. <br />
<br />
To test whether the type of heard emotion has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
<br />
== References ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004, October). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM<br />
<br />
Cognition and Brain Science Unit. (2014, October 2). Rules of thumb on magnitudes of effect sizes. Retrieved from CBU: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Royakkers, L., Damen, F., Est, R. V., Besters, M., Brom, F., Dorren, G., & Smits, M. W. (2012). Overal robots: automatisering van de liefde tot de dood.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Vossen, S., Ham, J., & Midden, C. (2010). What makes social feedback from a robot work? disentangling the effect of speech, physical appearance and evaluation. In Persuasive technology (pp. 52-57). Springer Berlin Heidelberg.<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.<br />
<br />
Yoo, K. H., & Gretzel, U. (2011). Creating more credible and persuasive recommender systems: The influence of source characteristics on recommender system evaluations. In Recommender systems handbook (pp. 455-477). Springer US<br />
</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Week_7&diff=15486Week 72014-10-16T13:26:28Z<p>S126005: /* TODO voor laatste week */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
= Meeting 13-10-2014 =<br />
Suzanne has finished the introduction. Iris, Meike and Floor still need to read it through carefully to check it.<br />
<br />
We still have to write the discussion, the conclusion and the scientific impact / ethics section. A large part of the discussion has already been written in Google Docs. It has not been put on the wiki yet because it is easier to edit in Google Docs.<br />
<br />
== Feedback week 6 ==<br />
Everyone:<br />
* We are still right on schedule!<br />
* Even on Wednesday evening we were very productive.<br />
* Working independently turns out not to be a problem for us.<br />
* It was useful that we had made a new plan of exactly what we wanted to do in each meeting.<br />
<br />
Iris:<br />
* Plus: Good that you keep refreshing our knowledge of what we already know about statistics. We have the impression that you know the most about it.<br />
* Tip: Try to keep trusting that things will turn out fine. Everything is going well, so try to keep that in mind.<br />
<br />
Suzanne:<br />
* Plus: Good work on Thursday. You kept typing away at the discussion of the summary. Very nice that you put so much effort into getting everything on paper.<br />
* Plus: You made good contributions on Wednesday. Even though you know the least about statistics, it is actually nice to hear your view on it now and then. <br />
* Tip: Just ask when you do not know something about Stata, because we actually like being able to explain it to you.<br />
* Tip: Communicate clearly what you are doing, because even though you are doing something very useful on your laptop (looking things up for clarification), to us it looks like you are not paying attention.<br />
<br />
Meike:<br />
* Plus: You just keep on working without needing any break. Super productive, so very good.<br />
* Tip: You had the feeling that we did not have enough time for all the statistical tests. Trust that it will be fine, because it will be. :)<br />
<br />
Floor:<br />
* Plus: You were very productive on Wednesday evening. <br />
* Tip: On Thursday afternoon your concentration dipped a bit.<br />
* Tip: Try to handle the temptation of your phone a bit better. <br />
<br />
<br />
== Coach meeting ==<br />
The report we have is good. It is the most important part of the wiki. The rest may stay organised per week, but it should receive less emphasis.<br />
<br />
The effects are very small; you would really want to see them higher.<br />
<br />
We have to look at the graphs once more, because the scale runs from -2 to 2. Explain clearly what the numbers stand for (Likert scale).<br />
<br />
Not required, but good style if you do this:<br />
* You want error bars (95% confidence interval) -- we can use Excel for this, since we only have 2 numbers anyway.<br />
* 1 decimal place after the point is fine<br />
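Those two numbers per condition (a mean plus a 95% error bar) are easy to compute by hand or in Excel; a minimal Python sketch, with made-up ratings on the -2..2 Likert scale, of how the interval comes out:<br />

```python
import math

# Hypothetical Likert ratings on the study's -2..2 scale
# (made up for illustration; the real data live in Stata).
ratings = [-1, 0, 1, 1, 2, 0, -1, 2, 1, 0]

n = len(ratings)
mean = sum(ratings) / n
# Sample standard deviation (divide by n - 1)
sd = math.sqrt(sum((x - mean) ** 2 for x in ratings) / (n - 1))
# Standard error and a normal-approximation 95% CI (z = 1.96);
# for small samples a t critical value would be slightly wider.
se = sd / math.sqrt(n)
half_width = 1.96 * se

print(f"mean = {mean:.1f}")  # rounded to 1 decimal, as advised
print(f"95% CI = [{mean - half_width:.1f}, {mean + half_width:.1f}]")
```

Excel's CONFIDENCE.NORM function computes the same half-width, so either route should give matching error bars.<br />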
<br />
# What did we actually measure?<br />
# Can we answer the hypothesis?<br />
# Give an explanation for the results (come up with one yourself)<br />
## If the effect exists at all, it is a small one<br />
## People may only be persuaded by the information, not by the emotion. So the informational content is what matters; that is what makes people more easily persuaded. We can substantiate this with Alex's report.<br />
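For point 2, the comparison behind the hypothesis boils down to a two-sample t-test on the condition means (done in Stata in the actual analysis); a rough pure-Python sketch with invented ratings, just to show the shape of the calculation:<br />

```python
import math

# Hypothetical mean-persuasion ratings per participant for the two
# speech conditions (invented numbers, not the study data).
neutral   = [0.2, -0.4, 0.5, 0.1, -0.2, 0.3, 0.0, 0.4]
emotional = [0.3, -0.1, 0.6, 0.2,  0.0, 0.5, 0.1, 0.4]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    # Sample variance (divide by n - 1)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Welch's t statistic for two independent samples
se = math.sqrt(var(neutral) / len(neutral) + var(emotional) / len(emotional))
t = (mean(emotional) - mean(neutral)) / se
print(f"t = {t:.2f}")
```

Stata's ttest command reports the same statistic together with its p-value; with differences this small, the effect would, as the coach noted, be small.<br />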
<br />
Based on these results we can say that emotion adds nothing to persuasiveness.<br />
<br />
The liveliness (animacy) does change, so people clearly notice the difference. This is in fact what we also want for someone with ALS.<br />
<br />
Add a user-aspect section to the report, and let it return in the discussion. A USER perspective, and to some extent a SOCIETY perspective.<br />
Suppose you wanted to sell it: what would you say to make it come across well? <br />
It is not something that can be applied within a week. Many steps remain (right now it is still very expensive). What are those steps?<br />
<br />
With the knowledge you have now gained, could this also be done with the Nao or Amigo robot? If so, how? What do I have to do to add this to my robot? Do these robots meet the requirements to apply it? Reflect with arguments on what it gives you.<br />
* The Nao can adjust every word individually. You can also load .wav files. The Nao uses Nuance. The community is actually already further with programming than what currently ships on the robot.<br />
* Ask René where to find how to implement it in Amigo.<br />
<br />
== Presentation and assessment ==<br />
<br />
It takes 15 minutes: about 10 minutes on the research and 5 minutes on the user aspect.<br />
<br />
Present the technical innovation, but above all in combination with the user perspective.<br />
<br />
What did you promise at the start, and what did and didn't you manage to deliver?<br />
<br />
After the presentation there is a quarter of an hour for questions. This also counts towards the assessment.<br />
<br />
<br />
We are assessed on:<br />
* the first 2 presentations<br />
* the final presentation<br />
* the discussion<br />
* the wiki<br />
* the process<br />
* the peer review<br />
<br />
Peer review:<br />
Write one all together and send it to René. Also give each other absolute marks from 1 to 10.<br />
<br />
<br />
== TODO for the final week ==<br />
<br />
* Put the sources in the introduction in APA style<br />
* Maybe clean up the do-file?<br />
* Put the do-file on the wiki<br />
<br />
<br />
<br />
'''Adjust the graphs'''<br />
* check the scales (y-axis) of the graphs<br />
* round the numbers above the bars to 1 (maybe 2) decimals<br />
* show error bars<br />
<br />
<br />
'''Applicability in robots'''<br />
* Nao<br />
* Amigo<br />
<br />
<br />
'''Finish the discussion'''<br />
* For each y-variable, give good reasons why this result comes out (problems with our study and/or based on sources)<br />
* Conclusion for animacy<br />
* Relate it to earlier sources<br />
* Use the 'desert study' source.<br />
<br />
<br />
'''Tidy up the wiki'''<br />
* Order everything per week<br />
* Make sure all the important information is in the summary.</div>
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
= Meeting 13-10-2014 =<br />
Suzanne heeft de inleiding afgemaakt. Dit moeten Iris, Meike en Floor dan nog eens goed doorlezen om te checken.<br />
<br />
We moeten de discussie, conclusie en de wetenschappelijke invloed / ethiek nog schrijven. De discussie hebben we al voor een groot deel geschreven in google docs. Nog niet op wiki gezet omdat het in Google Docs makkelijker is om aan te passen.<br />
<br />
== Feedback week 6 ==<br />
Allemaal:<br />
* We lopen nog steeds super goed op schema!<br />
* Zelfs op woensdagavond waren wij zeer productief.<br />
* Zelfstandigheid blijkt bij ons geen probleem.<br />
* Handig dat we een nieuwe planning hadden gemaakt wat we precies per meeting wilde doen.<br />
<br />
Iris:<br />
* Top: Goed dat je iedere de kennis goed terughaalt over wat we al weten over statistiek. We hebben de indruk dat jij er nog het meest van weet.<br />
* Tip: Vertrouwen dat het wel goed gaat moet je er toch weer proberen in te houden. Alles gaat goed, dus probeer het op een rijtje te houden.<br />
<br />
Suzanne:<br />
* Top: Goed bezig donderdag. Goed doorgetypt voor de discussie van de samenvatting. Heel fijn dat je zo je best hebt gedaan op alles op papier te zetten.<br />
* Top: Woensdag had je goede inbreng. Ook al weet je het minst over statistiek is het juist fijn om af en toe jouw visie er op te horen. <br />
* Tip: Gewoon vragen als je iets niet weet van Stata want wij vinden het juist fijn als we het je uit kunnen leggen.<br />
* Tip: Goed communiceren wat je aan het doen bent, want ook al ben je super nuttig bezig op je laptop (dingen opzoeken voor verduidelijking) lijkt het voor ons dat je niet op aan het letten bent.<br />
<br />
Meike:<br />
* Top: Je blijft maar door gaan met werken zonder dat je enige pauze nodig hebt. Super productief dus super goed.<br />
* Tip: Je had het idee dat we niet genoeg tijd hadden voor alle testen in de statistiek. Heb er vertrouwen in dat het goed komt, want het komt ook goed. :)<br />
<br />
Floor:<br />
* Top: Je bent woensdagavond erg productief bezig geweest. <br />
* Tip: Donderdagmiddag was je concentratie even wat minder.<br />
* Tip: Probeer wat beter om te gaan met de verleiding van je telefoon. <br />
<br />
<br />
== Coach meeting ==<br />
The report we have is good. It is the most important part of the wiki. The rest may stay structured per week, but it should get less attention.<br />
<br />
The effects are very small; you would really want to see them larger.<br />
<br />
We need to look at the graphs once more, because the scale runs from -2 to 2. Explain clearly what the numbers stand for (Likert scale).<br />
<br />
Not required, but good style if you do this:<br />
* You want error bars (95% confidence intervals) -- we can use Excel for this, since there are only two values anyway.<br />
* One decimal place is fine<br />
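A minimal sketch of those error-bar values (the mean, SD and n below are hypothetical placeholders for the numbers read off from Stata, and the normal-approximation z of about 1.96 is used instead of the exact t critical value):<br />

```python
from math import sqrt
from statistics import NormalDist

def ci95_half_width(sd: float, n: int) -> float:
    """Half-width of a 95% confidence interval for a mean,
    using the normal approximation z * SD / sqrt(n)."""
    z = NormalDist().inv_cdf(0.975)  # ~1.96
    return z * sd / sqrt(n)

# Hypothetical condition summary: mean 0.4, SD 0.8, n = 47 participants.
half = ci95_half_width(0.8, 47)
lower, upper = 0.4 - half, 0.4 + half
```

In Excel, the half-width itself is what goes into the custom positive/negative error value of the bar chart.<br />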
<br />
# What did we actually measure?<br />
# Can we answer the hypothesis?<br />
# Give an explanation for the results (come up with one yourself)<br />
## If there is an effect at all, it is a small one<br />
## People may only be persuaded by the information, not by the emotion. So the substantive information is what matters and is what persuades people. We can substantiate this with Alex's report. <br />
<br />
Based on these results we can say that emotion adds nothing to persuasiveness.<br />
<br />
The animacy does change, though, so people definitely notice the difference. This is actually what we also want for someone with ALS.<br />
<br />
Add a user-aspect section to the report and let it return in the discussion. USER perspective, and to some extent a SOCIETY perspective.<br />
Suppose you wanted to sell it: what would you say to present it well? <br />
It is not something that can be applied within a week. There are still many steps to go (at the moment it is very expensive). What are those steps?<br />
<br />
With the knowledge you have now gained, could this also be done with the Nao or Amigo robot? If so, how? What would I have to do to add this to my robot? Do the requirements of these robots suffice to apply it? Reflect with arguments on what it gains you.<br />
* The Nao can adjust each word individually. You can also load .wav files. Nao uses Nuance. The programming is actually already further along than what currently runs on the robot.<br />
* Ask René where to find how to implement it on Amigo.<br />
<br />
== Presentation and assessment ==<br />
<br />
It lasts 15 minutes: about 10 minutes on the research and 5 minutes on the user aspect.<br />
<br />
Present the technical innovation, but above all in combination with the user perspective.<br />
<br />
What did you promise at the start, and what did and didn't you manage to achieve?<br />
<br />
After the presentation there is a quarter of an hour for questions. This also counts towards the assessment.<br />
<br />
<br />
We will be assessed on:<br />
* the first 2 presentations<br />
* the final presentation<br />
* the discussion<br />
* the wiki<br />
* the process<br />
* the peer review<br />
<br />
Peer review:<br />
Make one all together and send it to René. Also give each other absolute marks from 1 to 10.<br />
<br />
<br />
== TODO for the final week ==<br />
<br />
Adjust the graphs<br />
* check the scales (y-axis) of the graphs<br />
* round the numbers above the bars to 1 (maybe 2) decimals<br />
* show error bars<br />
<br />
Applicability in robots<br />
* Nao<br />
* Amigo<br />
<br />
Finish the discussion<br />
* Give good reasons per y-variable for why this result appears (problems with our study and/or based on sources)<br />
* Conclusion for animacy<br />
* Related to earlier sources</div>
S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15259Samenvatting2014-10-13T08:03:36Z<p>S126005: /* Resultaten */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
Perhaps still interesting for the presentation:<br />
http://www.euanmacdonaldcentre.com/about-the-centre/euans-story/<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on the physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that spread all over Facebook. The idea of the campaign was to raise awareness of the disease so that money would be donated for further research. In ALS, the neurons responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own bodies. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
A team of researchers (EUAN MacDonald Centre, 2014) tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions ([[Samenvatting van de gevonden artikelen over kenmerken van emoties]]). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to correctly recognize certain emotions, such as sadness. Other studies offer an explanation for this problem: they state that emotion cannot be recognized from acoustic features alone. Brusso et al. say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that combining multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is based only on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. The research starts from sentences spoken without emotion but which express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' through the words that were chosen. But how large or powerful is the effect of adding acoustic features of emotions? This leads to the research question of this study:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness ([[Bronnen persuasiveness]]). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has broader applications. It can be used whenever no anatomical features of emotions are available but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices that want to convince people to behave in a certain way, and some of them use only voice recordings instead of an avatar. In these cases the use of acoustic and grammatical features could strengthen the convincingness of the device, which leads to the desired change in behavior.<br />
<br />
== Method ==<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from your text with their voices. The voice of Will was used because it is English (US) and offers the different functions 'happy', 'sad' and 'neutral'. (Acapela Group, 2009) To generate the voice of Will, the speech of a person was recorded; those parameters are used and applied to the written text. In this programming, attention was paid to diphones, syllables, morphemes, words, phrases, and sentences. (Acapela Group, 2014) The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping. (Acapela Group, 2009) This is very useful for different emotions: a sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation. (Williams & Stevens, 1972) A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry. (Williams & Stevens, 1972) Because the speech rates for happiness and anger do not differ much, the value of 4.15 syllables per second was also used for happiness. (Breazeal, 2001) Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice a low average pitch. (Liscombe, 2007) So the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is free, open-source, cross-platform software to record and edit audio. (Audacity, 2014) Audacity offers many functions with which you can give an audio fragment an emotional tone. With the function 'amplify' you choose a new peak amplitude, which was used to make the happy audio fragments louder than the sad ones. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of a sad voice. (Bowles & Pauletto, 2010) The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant over the second half. (Williams & Stevens, 1972) A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches. (Breazeal, 2001) In Audacity you can change the pitch of an individual word by selecting it, choosing the effect 'adjust pitch' and filling in the percentage by which you want to change the pitch. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions. (Bowles & Pauletto, 2010) By selecting a break between words and using the function 'change tempo' in Audacity, this break can be lengthened; the same function can be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when happy, and when a person is sad, long words are pronounced 20% slower and short words 10-20% slower than in a neutral voice. (Bowles & Pauletto, 2010) With the function 'change tempo' the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
<br />
(For the specific adjustments made to the sentences used, see [[Extra zinnen maken]])<br />
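As a numeric sketch of the adjustments above (the 7 dB difference and the syllable rates are the cited figures; the 12-syllable sentence and the 0.40 s word are made-up examples, and the dB conversion is the standard audio formula):<br />

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)

def sentence_duration(syllables: int, rate: float) -> float:
    """Rough sentence duration in seconds at a given syllable rate."""
    return syllables / rate

# Happy fragments are ~7 dB louder than sad ones: about 2.24x the amplitude.
happy_vs_sad_gain = db_to_amplitude_ratio(7)

# A hypothetical 12-syllable sentence at the cited speech rates.
happy_len = sentence_duration(12, 4.15)  # happy: ~2.9 s
sad_len = sentence_duration(12, 1.91)    # sad:   ~6.3 s

# A hypothetical 0.40 s long word, slowed by 20% for the sad voice.
sad_word_len = 0.40 * 1.20
```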
<br />
With Google Forms you can create a survey together with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every kind of question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a spreadsheet program. The spreadsheet can be imported into Stata, where it is treated as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test is suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Therefore there is only one direction in which there could be an effect, and a two-tailed t-test is not needed. <br />
However, the effect size is presumably small rather than moderate. A second power analysis was executed to see how many participants would be needed for a small effect size: 394 participants per condition for a two-tailed t-test and 310 per condition for a one-tailed t-test. Because the resources and time to collect that many participants were not available, the experiment was executed with 51 participants per condition. <br />
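The power analysis above can be sketched as follows. This uses the normal approximation for the two-sample t-test (not the exact t-based computation of a dedicated power tool), which here lands within one participant of the figures reported: it gives 63/50 for a moderate effect (d = 0.5) against the reported 64/51, and 393/310 for a small effect (d = 0.2) against the reported 394/310.<br />

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d: float, alpha: float, power: float, two_sided: bool) -> int:
    """Approximate sample size per group for an independent-samples t-test,
    via the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2) if two_sided else nd.inv_cdf(1 - alpha)
    z_power = nd.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Moderate (d = 0.5) and small (d = 0.2) effect sizes, power 0.8, alpha 0.05.
moderate_two_tailed = n_per_group(0.5, 0.05, 0.8, two_sided=True)
moderate_one_tailed = n_per_group(0.5, 0.05, 0.8, two_sided=False)
small_two_tailed = n_per_group(0.2, 0.05, 0.8, two_sided=True)
small_one_tailed = n_per_group(0.2, 0.05, 0.8, two_sided=False)
```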
<br />
For our experiment a between-subjects design was used. The two conditions were emotionally loaded voices and neutral voices; each group heard only one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All of these were measured on Likert scales from 1 to 5 and are treated as interval variables. The dependent variables were composed of several questions in the questionnaire: questions 27, 30, 35, 38 and 40 for likeability; questions 28, 31, 32 and 34 for animacy; and questions 29, 33, 36 and 37 for persuasiveness.<br />
<br />
Because the survey is about water consumption and persuadability, two covariates were constructed that could influence the way participants responded to Will's comments: how easily a participant is convinced, and how much a participant cares about the environment. Both are composed of several questionnaire items: how easily one is convinced of questions 5, 8, 13 and 18, and care about the environment of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
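Each scale score can be computed as the mean of its questionnaire items. A minimal sketch with a hypothetical participant's answers (the question groupings are the ones listed above; the data are illustrative only):

```python
from statistics import mean

# Question numbers that make up each dependent-variable scale (from the design above).
SCALES = {
    "likeability":    [27, 30, 35, 38, 40],
    "animacy":        [28, 31, 32, 34],
    "persuasiveness": [29, 33, 36, 37],
}

def scale_scores(responses):
    """responses: dict mapping question number -> Likert answer (1-5)."""
    return {name: mean(responses[q] for q in items)
            for name, items in SCALES.items()}

# Hypothetical participant who answered 3 everywhere except question 27:
answers = {q: 3 for q in range(27, 41)}
answers[27] = 5
print(scale_scores(answers))  # likeability = 3.4, the other scales 3
```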
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]].<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part was a general one that asked for demographic information. In addition, some questions were asked about two personal characteristics, and some filler questions were included to prevent participants from immediately guessing the research goal. The second part consisted of a simulation in which participants filled in questions about their showering habits. After each answer, an audio fragment was played that gave either positive or negative feedback. The third part contained questions about how the voice was experienced. Finally, we added some questions to give an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the friend lists of our Facebook accounts. Facebook was used to avoid recruiting elderly people who are unable to change their showering habits because they live in a care home. In addition, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One more participant was removed after commenting that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed, as were participants who were totally distracted while filling in the questionnaire. This left 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested for reliability by calculating Cronbach's alpha, which was 0.85, 0.89, and 0.85 respectively. The concepts will be discussed in that order. <br />
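Cronbach's alpha is computed from the item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / variance of totals). A sketch with hypothetical data (perfectly consistent items, so alpha comes out at 1):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: list of per-item response lists, one list per question."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]  # total score per participant
    item_var = sum(pvariance(vals) for vals in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Three identical items: maximal internal consistency.
print(round(cronbach_alpha([[1, 2, 5, 4], [1, 2, 5, 4], [1, 2, 5, 4]]), 6))  # 1.0
```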
<br />
To test whether persuasion of the voice was perceived differently between the conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. After this, a second test was done in which participants were included only if they said they were willing to adjust their showering habits. In this case the p-value was 0.95 and <math>\eta ^2</math> = 0.00005. <br />
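For the ANOVA, the reported effect size η² is the between-groups sum of squares divided by the total sum of squares. A sketch with hypothetical group data:

```python
def eta_squared(groups):
    """groups: list of lists of scores, one list per condition."""
    all_scores = [x for g in groups for x in g]
    grand = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand) ** 2 for x in all_scores)
    ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)
    return ss_between / ss_total

print(eta_squared([[1, 1], [3, 3]]))  # 1.0: all variance is between groups
print(eta_squared([[1, 3], [1, 3]]))  # 0.0: no between-group variance
```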
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test examined whether participants in the emotion condition rated the voice as more likeable than participants in the neutral condition. Since not all assumptions for an ANOVA were met (normality across both conditions was rejected), the non-parametric Kruskal-Wallis test (Stata's kwallis) was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect size was calculated with the formula <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf). Figure 2 shows these results.<br />
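The Kruskal-Wallis statistic behind Stata's kwallis compares mean ranks per group. A sketch (tie corrections omitted for brevity), with the effect size computed via the formula above, η² = χ²/(N−1):

```python
def kruskal_wallis_h(groups):
    """H statistic for k independent groups (assumes no tied values)."""
    pooled = sorted(x for g in groups for x in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n = len(pooled)
    h = 0.0
    for g in groups:
        mean_rank = sum(rank[x] for x in g) / len(g)
        h += len(g) * (mean_rank - (n + 1) / 2) ** 2
    return 12 / (n * (n + 1)) * h

# Toy example: two clearly separated groups of three scores each.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6]])
print(h, h / (6 - 1))  # H about 3.857, eta^2 about 0.771
```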
<br />
As a follow-up, the Kruskal-Wallis test was executed twice more: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when the heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between the conditions; this test was chosen because the ANOVA assumptions were not met (equal variance across groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4. <br />
<br />
To test whether the type of emotion heard influences perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
<br />
== Sources ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. New York: Columbia University.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15258Samenvatting2014-10-13T08:03:22Z<p>S126005: /* Resultaten */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
Possibly still interesting for the presentation:<br />
http://www.euanmacdonaldcentre.com/about-the-centre/euans-story/<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person's freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the 'ice bucket challenge' that spread all over Facebook. The idea of the campaign was to raise awareness of the disease so that money would be raised for further research. With ALS, the neurons responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways: at some point, for example, they may not be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the researchers feel that it can be improved by adding emotions to the voice recordings. <br />
<br />
Several studies have investigated the features of particular emotions ([[Samenvatting van de gevonden artikelen over kenmerken van emoties]]). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features, it was difficult for the sample group to recognize certain emotions, like sadness. Other studies offer an explanation for this problem: emotion cannot be recognized from acoustic features alone. Brusso et al. state that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Both studies thus conclude that combining multiple characteristics of emotions leads to the correct recognition of a specific emotion. When implementing these findings in a speech program, it is not possible to include anatomical features. Nevertheless, it is important to find the best possible way for ALS patients to express themselves verbally, even though it is based only on acoustic and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic and grammatical features. This research starts with sentences that are spoken without acoustic emotion but that express emotion grammatically. The sentence 'You did so great, I thought it was amazing' expresses the emotion 'happy' because of the words that were chosen. But how large is the effect of additionally adding acoustic features of emotions? This leads to the research question:<br />
<br />
What is the effect on human perception of adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness ([[Bronnen persuasiveness]]). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has broader applications. It can be used whenever no anatomical features of emotions are available but it is still necessary to communicate a certain emotion. In persuasive technology, for example, some devices that aim to convince people to behave in a certain way use only voice recordings instead of an avatar. In those cases the use of acoustic and grammatical features could strengthen the convincingness of the device, leading to the desired change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
* '''Materials'''<br />
For the research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. Their use is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from written text. The voice 'Will' was used because it is English (US) and offers the functions 'happy', 'sad' and 'neutral' (Acapela Group, 2009). To generate the voice of Will, the speech of a person was recorded, and those parameters were implemented in the written text; attention was paid to diphones, syllables, morphemes, words, phrases, and sentences (Acapela Group, 2014). Acapela Box also makes it possible to change the speech rate and the voice shaping (Acapela Group, 2009), which is very useful for different emotions. Sad sentences are spoken more slowly than happy ones, because of long breaks between words and slower pronunciation (Williams & Stevens, 1972). A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry (Williams & Stevens, 1972); because the speech rates of happy and angry speech do not differ much, the value of 4.15 syllables per second was also used for happiness (Breazeal, 2001). Voice shaping changes the pitch of the voice: a happy voice has an average high pitch and a sad voice an average low pitch (Liscombe, 2007), so Acapela Box was used to change the voice shape accordingly.<br />
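The speech-rate figures above translate directly into a tempo adjustment: given a fragment's syllable count and current duration, one can compute the percentage change needed to hit a target rate. A sketch using the rates cited above (1.91 syl/s for sad, 4.15 syl/s for happy); the function and the example sentence are illustrative, not part of Acapela Box:

```python
def tempo_change_percent(syllables, duration_s, target_rate):
    """Percent tempo change so a fragment hits target_rate syllables/second."""
    current_rate = syllables / duration_s
    return (target_rate / current_rate - 1) * 100

# A 12-syllable sentence spoken in 4 s (3 syl/s, roughly neutral):
print(tempo_change_percent(12, 4.0, 1.91))  # about -36%: slow down for sadness
print(tempo_change_percent(12, 4.0, 4.15))  # about +38%: speed up for happiness
```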
<br />
Audacity is free, open-source, cross-platform software to record and edit audio (Audacity, 2014). It offers many functions with which an audio fragment can be given an emotional tone. With 'amplify' a new peak amplitude can be chosen, which was used to make the happy fragments louder than the sad ones; the difference in amplitude between these two emotions is 7 dB on average, and the loudness of a neutral voice is close to that of a sad voice (Bowles & Pauletto, 2010). The pitch of a sad voice decreases at the beginning of a sentence and remains rather constant in the second half (Williams & Stevens, 1972); a sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, a happy voice shows a variety of pitches (Breazeal, 2001). In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect 'adjust pitch' and filling in the desired percentage change. A sad voice also has longer breaks between words than any other emotion (Bowles & Pauletto, 2010); by selecting a break and using 'change tempo', the break can be lengthened. This function can also lengthen or shorten individual words, which is useful because one-syllable words are pronounced faster when happy, and when sad, long words are pronounced 20% slower and short words 10-20% slower than in a neutral voice (Bowles & Pauletto, 2010). With 'change tempo' the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
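The 7 dB amplitude difference between happy and sad fragments corresponds to a linear amplitude ratio, which is what the 'amplify' effect actually scales. A quick check of that conversion (20·log10 amplitude convention):

```python
def db_to_amplitude_ratio(db):
    """Convert a decibel difference to a linear amplitude ratio."""
    return 10 ** (db / 20)

print(round(db_to_amplitude_ratio(7), 2))  # 7 dB is roughly a 2.24x amplitude difference
```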
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms a survey can be created collaboratively; it is a tool used to collect information. Audio fragments (via video) can be inserted and any kind of question can be written. After enough participants have responded, the data can be collected in a spreadsheet and exported to a .xlsx document for Microsoft Office Excel (Google Inc., 2014). <br />
<br />
</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15257Samenvatting2014-10-13T08:02:13Z<p>S126005: /* Introduction */</p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
Misschien nog interessant voor in de prestentatie:<br />
http://www.euanmacdonaldcentre.com/about-the-centre/euans-story/<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that their right for freedom is protected by the constitution and several human right organizations. Freedom is related to both the physical- and mental state and it can be limited or taken away. Muscles diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS. Mainly because of the ‘ice bucket challenge’ that went all over Facebook. The idea of the campaign was to gain more awareness for this disease so that money would be raised for further research. With ALS the neurons that are responsible for muscles movement will die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the ronsponsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, like the muscles that helps a person to breathe, stop functioning. <br />
During their illness people who suffer from ALS feel like a prisoner in their own body. Their freedom is decreasing in multiple ways. For example at some point they may not be able to express themselves verbally as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of reseachers (EUAN MacDonald Centre, 2014) that that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients prior to the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists the company feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
There are researches that have investigated the features of particular emotions ([[Samenvatting van de gevonden artikelen over kenmerken van emoties]]). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came with an explanation for this problem. They state that emotion cannot be recognized by acoustic features alone. Brusso et al. say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. Scherer, Ladd, & Silverman (1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emtions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program it is not possible to include anatomical features. Nevertheless it is important to find the best possible way for ALS patients to express themselves verbally eventhough it is only based on acoustic- and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic- and grammatical features. At first this research starts with sentences without emotion, but who express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect when adding physical features of emotions? This question leads to the research question of this research:<br />
<br />
What is the effect on the perception of humans by adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge from prior research, the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive effect on likeability, animacy and persuasiveness ([[Bronnen persuasiveness]]). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but it has broader applications as well. It can be used whenever no anatomical features of emotions are available, but it is still necessary to communicate a certain emotion. In persuasive technology, for example, there are devices designed to convince people to behave in a certain way. Some of these devices use only voice recordings instead of an avatar. In such cases the use of acoustic and grammatical features could strengthen the persuasiveness of the device, leading to the desired change in behavior.<br />
<br />
== Method ==<br />
<br />
* '''Materials'''<br />
For this research five different programs were used: Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of each program is explained below.<br />
<br />
Acapela Box is an online text-to-speech generator that creates voice messages from written text using its set of voices. The voice of Will was used because it is English (US) and offers the presets ‘happy’, ‘sad’ and ‘neutral’. (Acapela Group, 2009) To generate the voice of Will, the speech of a person was recorded; those parameters are then applied to the written text. In this process attention is paid to diphones, syllables, morphemes, words, phrases, and sentences. (Acapela Group, 2014) The Acapela Box also gives the opportunity to change the speech rate and the voice shaping. (Acapela Group, 2009) This is very useful for expressing different emotions. A sad sentence is spoken more slowly than a happy one, because of long breaks between words and slower pronunciation. (Williams & Stevens, 1972) A person talks at 1.91 syllables per second when sad and at 4.15 syllables per second when angry. (Williams & Stevens, 1972) Because the speech rates of happiness and anger do not differ much, the value of 4.15 syllables per second was also used for happiness. (Breazeal, 2001) Voice shaping changes the pitch of the voice. A happy voice has a high average pitch and a sad voice has a low average pitch. (Liscombe, 2007) The Acapela Box was therefore used to change this voice shape.<br />
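These syllable rates directly determine how long a sentence should last. As an illustration of the arithmetic (not part of the Acapela Box itself):<br />

```python
# Approximate speech rates from Williams & Stevens (1972), in syllables
# per second; 'happy' borrows the reported rate for anger (Breazeal, 2001).
RATES = {"sad": 1.91, "happy": 4.15}

def target_duration(n_syllables, emotion):
    """Return the target duration of a sentence in seconds."""
    return n_syllables / RATES[emotion]

# A 12-syllable sentence: sad -> ~6.28 s, happy -> ~2.89 s.
```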
<br />
Audacity is free, open-source, cross-platform software to record and edit audio. (Audacity, 2014) Audacity offers many functions with which an audio fragment can be given an emotional tone. With the function ‘amplify’ a new peak amplitude can be chosen, which was used to make the happy audio fragments louder than the sad ones. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto; the loudness of a neutral voice is close to that of a sad voice. (Bowles & Pauletto, 2010) The pitch of a sad voice decreases at the beginning of the sentence and remains rather constant during the second half of the sentence. (Williams & Stevens, 1972) A sad voice can also be recognized by the pitch decreasing at the end of the sentence. In contrast, a happy voice shows a variety of different pitches. (Breazeal, 2001) In Audacity the pitch of an individual word can be changed by selecting it, choosing the effect ‘adjust pitch’ and filling in the percentage by which the pitch should change. Another feature of a sad voice is that it has longer breaks between two words compared to all other emotions. (Bowles & Pauletto, 2010) By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be increased. This function can also be used to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster in a happy voice, and if a person is sad, longer words are pronounced 20% slower and short words 10-20% slower than in a neutral voice. (Bowles & Pauletto, 2010) With the function ‘change tempo’ the speed of the voice changes but the pitch does not, which is exactly what is needed for these emotions.<br />
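A difference of 7 dB corresponds to multiplying the waveform amplitude by <math>10^{7/20} \approx 2.24</math>. The arithmetic behind Audacity's ‘amplify’ effect can be sketched as follows (an illustration only; the actual editing was done in Audacity itself):<br />

```python
def db_to_gain(db):
    """Convert a decibel difference to a linear amplitude factor."""
    return 10 ** (db / 20)

def amplify(samples, db):
    """Scale raw audio samples by a dB amount, like Audacity's 'amplify'."""
    gain = db_to_gain(db)
    return [s * gain for s in samples]

# A +7 dB boost multiplies the amplitude by roughly 2.24.
```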
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms several people can create a survey together at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and any type of question can be written. After enough data has been received from the participants, the information can be collected in a spreadsheet, which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program for saving spreadsheets. Such a spreadsheet can be imported into Stata, where it is treated as data in separate variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed, a power analysis was done to get an idea of how many participants might be needed. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size, 64 participants per condition were needed for a two-tailed t-test and 51 participants per condition for a one-tailed t-test. A one-tailed t-test would be suitable for the experiment, because either no relation or a positive relation was expected; a negative relation was not. Since there is only one direction in which an effect could occur, a two-tailed t-test was not needed. <br />
However, it is likely that the effect size is small rather than moderate. Another power analysis was executed to see how many participants would be needed in that case. For a two-tailed t-test 394 participants per condition would be needed, and for a one-tailed t-test 310 participants per condition. Because the resources and time to collect that many participants were not available, the experiment was executed with 51 participants per condition. <br />
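These sample sizes can be approximately reproduced with a short calculation, assuming Cohen's conventional effect sizes d = 0.5 (moderate) and d = 0.2 (small). The sketch below uses the normal approximation, which comes out about one participant below the exact t-based values reported above:<br />

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8, tails=2):
    """Approximate per-group sample size for a two-sample t-test
    (normal approximation to the power calculation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# d = 0.5: 63 (two-tailed) and 50 (one-tailed) per condition
# d = 0.2: 393 (two-tailed) and 310 (one-tailed) per condition
```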
<br />
For our experiment a between-subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices; each group only heard one of the conditions. These two conditions made up the independent variable, which is therefore a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All of these were measured on a Likert scale from 1 to 5 and are interval variables. The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability, questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy. The dependent variable persuasiveness was composed of questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadability, two variables were created that could influence the way participants responded to the comments of Will. These covariates are how easily someone is convinced and how much someone cares about the environment. Both covariates are composed of several questions in the questionnaire: how easily someone is convinced is composed of questions 5, 8, 13 and 18, and how much someone cares about the environment is composed of questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
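Scoring such composite scales amounts to averaging the Likert answers of the relevant items. A minimal sketch (the question numbers are from the questionnaire; the response data shown in the comment are hypothetical):<br />

```python
from statistics import mean

# Questionnaire items that make up each composite scale.
SCALES = {
    "likeability":    [27, 30, 35, 38, 40],
    "animacy":        [28, 31, 32, 34],
    "persuasiveness": [29, 33, 36, 37],
}

def composite(responses, scale):
    """Average the 1-5 Likert answers belonging to one scale.

    `responses` maps question number -> answer (1..5),
    e.g. {27: 4, 30: 5, 35: 3, 38: 4, 40: 4}."""
    return mean(responses[q] for q in SCALES[scale])
```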
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]].<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire consisting of three parts.<br />
The first part is a general one in which some demographic information was asked, together with questions about two personal characteristics. Furthermore, some filler questions were included to prevent participants from directly guessing our research goal. The second part consisted of a simulation in which everyone was asked to answer some questions about their showering habits. After each answer, an audio fragment was played that gave either positive or negative feedback. The third part contained questions about the experience of the voice heard. Finally, we added some questions to give us an indication of general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participants''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from the lists of friends on our Facebook accounts. Facebook was used to prevent elderly people, who are unable to change their showering habits because they live in a care home, from filling in the questionnaire. In addition, only participants older than 18 were asked, since younger people often do not pay their own energy bill.<br />
<br />
From the original data set, three participants were removed because they submitted the questionnaire twice. One additional person was removed because this participant commented that he had not understood the questions about Will. Furthermore, participants who totally disagreed when asked whether they master the English language were removed. Finally, participants who were totally distracted while filling in the questionnaire were removed. This left us with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or the one with the emotionally loaded voice (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, aged between 18 and 57 (M = 27.5, SD = 12.2). The second category consisted of 18 men and 30 women, aged between 18 and 57 (M = 29.6, SD = 13.3). The highest completed degree of education varied among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%), where the first value is for category 1 and the second for category 2.<br />
<br />
== Results ==<br />
<br />
As stated in the introduction, three concepts of perception are used for this research: persuasion, likeability and animacy. The scales for these concepts were tested on reliability by calculating Cronbach's alpha; these values were 0.85, 0.89, and 0.85 respectively. The concepts will be discussed in the given order. <br />
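Cronbach's alpha is computed as <math>\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma^2_{i}}{\sigma^2_{total}}\right)</math>, with k the number of items. A minimal sketch (the analysis itself was done in Stata; the data here are made up):<br />

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of columns: one list of respondent scores per item."""
    k = len(items)
    item_vars = sum(variance(col) for col in items)   # sum of item variances
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - item_vars / variance(totals))
```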
<br />
To test whether the persuasion of the voice was perceived differently between the two conditions, an ANOVA was performed. This resulted in p = 0.96 and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. After this, a second test was done in which participants were only included if they said that they were willing to adjust their showering habits. In this case the p-value was 0.95 and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who heard at least four positive audio fragments (p = 0.43; <math> \eta ^2 </math> = 0.03) or at least four negative audio fragments (p = 0.69; <math> \eta ^2 </math> = 0.02), it was also tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test was performed to see whether participants in the condition with emotion rated the voice as more likeable than participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric Kruskal-Wallis test (kwallis in Stata) was performed. This gave a p-value of 0.17 and <math>\eta ^2</math> = 0.02. This effect size was calculated using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf). Figure 2 shows these results.<br />
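What Stata's kwallis computes can be sketched as follows: the H statistic (here in a simplified form without the tie-correction divisor, though tied values do receive their average rank) together with the effect-size formula above. The example data in the tests are hypothetical:<br />

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (simplified: no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    # Assign each distinct value the average of the ranks it occupies.
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    h = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * h - 3 * (n + 1)

def eta_squared(chi2, n):
    """Effect size for a Kruskal-Wallis test: eta^2 = chi^2 / (N - 1)."""
    return chi2 / (n - 1)
```

For instance, a chi-squared value of 1.86 with N = 94 participants yields the <math>\eta ^2</math> of 0.02 reported above.<br />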
<br />
As a follow-up, the kwallis test was executed twice more: once with only participants who heard at least four positive audio fragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and once with only participants who heard at least four negative audio fragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audio fragments on likeability is shown in figure 3. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500 px | Figure 3: Likeability per condition when heard audio-fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition <br />
Figure 3: Likeability per condition when the heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a kwallis test was done to test whether there is a difference in animacy between the two conditions. A kwallis test was chosen because the assumptions for an ANOVA were not met (equal variance for both groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference found can be seen in figure 4. <br />
<br />
To test whether the type of emotion heard has some influence on perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested, as well as participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
<br />
== Sources ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Samenvatting&diff=15256Samenvatting2014-10-13T07:59:52Z<p>S126005: </p>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
Perhaps also interesting for the presentation:<br />
http://www.euanmacdonaldcentre.com/about-the-centre/euans-story/<br />
<br />
== Introduction ==<br />
<br />
<br />
Freedom is valuable to humans. It is so important that the right to freedom is protected by constitutions and several human rights organizations. Freedom relates to both the physical and the mental state, and it can be limited or taken away. Muscle diseases take away a person’s freedom on a physical level.<br />
<br />
A muscular disease that recently gained more attention is ALS, mainly because of the ‘ice bucket challenge’ that spread all over Facebook. The idea of the campaign was to raise awareness of this disease so that money would be raised for further research. With ALS, the neurons that are responsible for muscle movement die off over time (Foundation ALS, 2014). Groups of muscles lose their function, because the responsible neurons can no longer send a signal from the brain to the muscles. This process continues until vital muscles, such as the muscles that help a person breathe, stop functioning. <br />
During their illness, people who suffer from ALS feel like prisoners in their own body. Their freedom decreases in multiple ways. For example, at some point they may no longer be able to express themselves verbally, as the vocal cords are muscles and can stop functioning. <br />
<br />
There is a team of researchers (EUAN MacDonald Centre, 2014) that tries to help people with ALS by giving them back their voice. They record the voice of ALS patients before the muscle failure, so that those recordings can be used in speech technology. Instead of hearing a computerized sound, ALS patients can hear their own voice when their ability to speak is impaired. This creates a stronger emotional bond between the patient and the loved ones surrounding them. Although such technology exists, the centre feels that it can be improved by adding emotions to the voice recordings. <br />
<br />
There are researches that have investigated the features of particular emotions ([[Samenvatting van de gevonden artikelen over kenmerken van emoties]]). These findings could be implemented in a speech program to add an emotion to a sentence. Although some emotions were recognized based on those features it was difficult for the sample group to successfully recognize certain emotions like sadness. Other studies came with an explanation for this problem. They state that emotion cannot be recognized by acoustic features alone. C.Brusso et all. say that the combination of acoustic features (pitch, frequency, etc.) and anatomical features (facial expressions) is more effective for the recognition of emotions. (Scherer, Ladd, & Silverman, 1984) found that the combination of acoustic features and grammatical features is effective. Thus both studies conclude that the combination of multiple characteristics of emtions leads to the correct recognition of a specific emotion. When implementing these findings into a speech program it is not possible to include anatomical features. Nevertheless it is important to find the best possible way for ALS patients to express themselves verbally eventhough it is only based on acoustic- and grammatical features, because it gives them an increased sense of freedom. <br />
<br />
The goal of this research is not to look further into the characteristics of emotions, but to investigate the strength of the combination of just the acoustic- and grammatical features. At first this research starts with sentences without emotion, but who express emotion grammatically. The sentence ‘You did so great, I thought it was amazing’ expresses the emotion ‘happy’ because of the words that were chosen. But how large or powerful is the effect when adding physical features of emotions? This question leads to the research question of this research:<br />
<br />
What is the effect on the perception of humans by adding acoustic features of emotions to a sentence with grammatical emotional features?<br />
<br />
Based on the knowledge of prior research the sentence with both acoustic and grammatical features should be more convincing. The hypothesis is that the recognition of emotion in a voice will have a positive affect on likeablity, animacy and persuasiveness ([[Bronnen persuasiveness]]). <br />
<br />
The outcome of this research can be used in speech technology for patients who suffer from ALS, but its use has a broader implementation. It can be used when no anatomical features of emotions are available, but when it is still necessary to communicate a certain emotion. With persuasive technology for example there are devices that want to convince people to behave in a certain way. Some devices use only voice recordings instead of an avatar. In these cases the use of acoustic- and grammatical features could have a stronger effect on the convincingness of the device which leads to the wanted change of behavior.<br />
<br />
== Method ==<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
* '''Materials'''<br />
For the research fives different programs were used. Acapela Box, Audacity, Google Forms, Microsoft Office Excel, and Stata. The use of the programs will now be explained.<br />
<br />
Acapela Box is an online text-to-speech generator which can create voice messages with your text and their voices. The voice of Will was used because this one is English (US) and has the different functions ‘happy’, ‘sad’ and ‘neutral’. (Acapela Group, 2009) To generate the voice of Will the speech of a person was recorded. Those parameters are used and implemented in the written text. While programming this, attention was paid to diaphones, syllables, morphemes, words, phrases, and sentences. (Acapela Group, 2014) The Acapela Box also gives you the opportunity to change the speech rate and the voice shaping. (Acapela Group, 2009) This can be very useful for different emotions. If sadness occurs, sentence will be spoken slower than if happiness occurs. This is because of long breaks between words and slower pronunciation. (Williams & Stevens, 1972) A person talks with 1.91 syllables per second if the person is sad and with 4.15 syllables per second if the person is angry. (Williams & Stevens, 1972) Because the speech rate of happy and angry does not differ much. It was chosen to use this value of 4.15 syllables per second for happiness. (Breazeal, 2001) Voice shaping changes the pitch of the voice. A happy voice has an average high pitch and a sad voice has an average low pitch. (Liscombe, 2007) So the Acapela Box was used to change this voice shape.<br />
<br />
Audacity is a free, open source, cross-platform software to record and edit audio. (Audacity, 2014) In Audacity you can use a lot of functions with which you can give the audio fragment an emotional tone. With the function ‘amplify’ you may choose a new peak-amplitude which is used to make the happy audio fragments louder than the sad fragments. The difference in amplitude between these two emotions is 7 dB on average according to the research by Bowles and Pauletto. The loudness of a neutral voice is close to that of the sad voice. (Bowles & Pauletto, 2010) The pitch of a voice while being sad decreases at the beginning of the sentence and remains rather constant at the second half of the sentence. (Williams & Stevens, 1972) A sad voice could also be recognized by the pitch decreasing at the end of the sentence. In contrast, being happy gives your voice a variety of different pitches. (Breazeal, 2001) In Audacity you can change the pitch of an individual word by selecting it, choosing the effect ‘adjust pitch’ and filling in the percentage you want to change the pitch into. Another feature is that a sad voice has longer breaks between two words comparing to all other emotions. (Bowles & Pauletto, 2010) By selecting a break between words and using the function ‘change tempo’ in Audacity, the length of this break can be made longer. This function can also help you to lengthen or shorten the words of a sentence individually. This is useful because words with only one syllable are pronounced faster when being happy. And if a person is sad, longer words are pronounced 20% slower and short words are pronounced 10-20% slower than a neutral voice. (Bowles & Pauletto, 2010). With the function ‘change tempo’ the speed of the voice changes, but the pitch does not. And this is exactly what we need for these emotions.<br />
<br />
(To see more specific adjustments on the sentences used, click here: [[Extra zinnen maken]])<br />
<br />
With Google Forms you can create a new survey with others at the same time. It is a tool used to collect information. Audio fragments (via video) can be inserted and every possible question can be written. After receiving enough data from the required participants, the information can be collected in a spreadsheet which can be exported to a .xlsx document (for the program Microsoft Office Excel). (Google Inc., 2014) <br />
<br />
Microsoft Office Excel is a program to save spreadsheets. This spreadsheet can be imported in the program Stata and from there on it is considered as data in different variables. Stata is used to interpret and analyze the data.<br />
<br />
<br />
* '''Design''' <br />
<br />
Before the experiment was executed a power analysis was done to get an idea of how many participants might be need. For the power analysis the power was set to 0.8 at a significance level of 0.05. For a moderate effect size 64 participants for each condition were needed in a two-tailed t-test and 51 participants for each condition were needed in a one-tailed t-test. A one-tailed t-test would be suitable for the experiment, because no relation or a positive relation was suspected. A negative relation was not to be expected. Therefore there is only one way in which there would be an effect and a two-tailed t-test would not be needed. <br />
However, it is presumably that the effect size is small instead of moderate. Another power analysis was executed to see how many participants were needed if the effect size is small. For a two-tailed t-test 394 participants per condition would be needed and for a one-tailed 310 participants per condition would be needed. Because the resources and time to collect so many participants were not available, the experiment is executed with 51 participants per condition. <br />
<br />
For our experiment a between subjects design was used. The two conditions researched were emotionally loaded voices and neutral voices. Each group only heard one of the conditions. These two conditions made up the independent variable, which therefore is a categorical variable. The dependent variables were likeability, animacy and persuasiveness. All these variables consisted of a Likert scale varying from 1 to 5. All the dependent variables are interval variables.The dependent variables were composed of several questions in the questionnaire. For the dependent variable likeability the questions 27, 30, 35, 38 and 40 from the questionnaire were used. Questions 28, 31, 32 and 34 were used for the dependent variable animacy. And the dependent variable persuasiveness composed of the questions 29, 33, 36 and 37.<br />
<br />
Because the survey is about water consumption and persuadibility two variables were made which could be of influence on the way participants responded on the comments of Will. These covariates are how easy you are to convince and how much you care about the environment. Both covariates are composed of several questions in the questionnaire. How easily you are convinced is composed of the questions 5, 8, 13 and 18. How much you care about the environment is composed of the questions 4, 6, 10, 11, 12, 15, 17 and 19. <br />
<br />
The questions of the questionnaire can be found at [[Opzet onderzoek 2.0]]<br />
<br />
<br />
* '''Procedure'''<br />
<br />
Each participant received a questionnaire. This questionnaire contained three parts.<br />
The first part is the general one in which some demographic information was asked. Besides some questions were asked about two personal characteristics. Furthermore, some questions were included to prevent that participants directly knew about our research goal. The second part consisted of a simulation in which everyone was supposed to fill in some questions about their showering habits. After each answer, an audio fragment was heard which either gave positive or negative feedback. The third part contained questions about the experience of the voice heard. At last, we added some final questions to give us an indication about general matters such as concentration and comprehensibility. <br />
<br />
<br />
* '''Participanten''' <br />
<br />
The participants (n = 101) were personally asked to fill in the questionnaire. They were gathered from our list of friends on our Facebook accounts. Facebook was used to prevent that elderly, which are unable to change their showering habits due to living in a care home, filled in the questionnaire. Besides, only participants older than 18 were asked, since younger people do often not pay their own energy bill.<br />
<br />
From the original data set, three participants are removed, since they submitted the questionnaire two times. Besides, one extra person is removed from the dataset, since this participant commented that he had not understood the questions about Will. Furthermore participant who totally disagreed when they were asked whether they master the English language, were removed. At last, participants were removed who were totally distracted while filling in the questionnaire. After this we were left with 94 participants.<br />
<br />
Each participant got either the questionnaire with the neutral voice (n1 = 46) or with the voice that sounds emotionally loaded (n2 = 48). The condition with the neutral voice is called category 1, and the other condition is called category 2.<br />
<br />
The first category consisted of 15 men and 31 women, and their age was between 18 and 57. (M = 27.5, SD = 12.2). The second category consisted 18 men and 30 women, and their age was between 18 and 57. (M = 29.6, SD = 13.3). The highest completed degree of education differed among primary school (0.0%; 4.2%), mavo (2.2%; 2.1%), havo (4.4%; 8.3%), vwo (39.1%; 31.3%), mbo (6.5%; 12.5%), hbo (15.2%; 22.9%), and university (32.6%; 18.8%). The first value is for category 1, and the second is for category 2.<br />
<br />
== Resultaten ==<br />
<br />
As stated in the introduction three concepts of perception are used for this reserach: persuasion, likeability and animacy. The scales for these concepts were tested on realibility by calculating Cronbach's alpha. These values were respectively 0.85, 0.89, and 0.85.The concepts will be discussed in the given order. <br />
<br />
To test whether persuasion of the voice is perceived differently between both conditions an ANOVA was performed. This resulted in p = 0.96, and <math>\eta ^2</math> = 0.00003. A graphical representation of this test can be seen in figure 1. After this, a second test was done in which participants were only included if they said that they are willing to adjust their showering habits. In this case the p-value was 0.95, and <math>\eta ^2</math> = 0.00005. <br />
<br />
For participants who either only heard at least four positive (p = 0.43; <math> \eta ^2 </math> = 0.03) or four negative audiofragments (p = 0.69; <math> \eta ^2 </math> = 0.02) it was tested whether persuasion differed between the conditions. <br />
<br />
[[File: Graph bar persuasion.png |500px|Figure 1: Persuasion per condition]]<br />
<br />
Figure 1: Persuasion per condition<br />
<br />
<br />
The third test that was performed was to see whether the participants in the condition with emotion rated the voice as more likeable compared to the participants in the neutral condition. Since not all assumptions for an ANOVA were met (normal distribution across both conditions was rejected), the non-parametric kwallist test was performed. This gave a p-value of 0.17, and <math>\eta ^2</math> = 0.02. This effect was calculated by using the following formula: <math> \eta ^2=\frac{\chi ^2}{N-1}</math> (http://oak.ucc.nau.edu/rh232/courses/EPS625/Handouts/Nonparametric/The%20Kruskal-Wallis%20Test.pdf). Figure 2 shows these results.<br />
<br />
As a follow up test, the kwallis was executed another two times, but now one time with only participants that at least heard four positive audiofragments (p = 0.02; <math> \eta ^2 </math> = 0.22), and the second time with only participants that at least heard four negative audiofragments (p = 0.73; <math> \eta ^2 </math> = 0.01). The effect of mainly hearing positive audiofragments on likeability is shown in figure 3. <br />
<br />
[[File: Graph bar likeability.png | 500px |Figure 2: Likeability per condition]]<br />
[[File: Graph bar likeability when emotion is mostly positive.png | 500px | Figure 3: Likeability per condition when heard audio fragments were mostly positive]]<br />
<br />
Figure 2: Likeability per condition<br />
Figure 3: Likeability per condition when heard audio fragments were mostly positive<br />
<br />
<br />
Finally, a Kruskal-Wallis test was done to test whether there is a difference in animacy between the two conditions. This test was chosen because the ANOVA assumptions were not met (equal variance across the groups was rejected). The p-value found was 0.02 and <math>\eta ^2</math> = 0.06. The difference can be seen in figure 4. <br />
<br />
To test whether the type of emotion heard influences perceived animacy, participants who heard at least four positive fragments (p = 0.16; <math> \eta ^2 </math> = 0.08) were tested separately, as were participants who heard at least four negative fragments (p = 0.17; <math> \eta ^2 </math> = 0.21).<br />
<br />
[[File: Graph bar animacy.png | 500px |Figure 4: Animacy per condition]]<br />
<br />
Figure 4: Animacy per condition<br />
<br />
== Discussion and conclusion ==<br />
<br />
== Sources ==<br />
<br />
Acapela Group. (2009, November 17). Acapela Box. Retrieved from Acapela: https://acapela-box.com/AcaBox/index.php<br />
<br />
Acapela Group. (2014, September 26). How does it work? Retrieved from Acapela: http://www.acapela-group.com/voices/how-does-it-work/<br />
<br />
Audacity. (2014, September 29). Audacity. Retrieved from Audacity: http://audacity.sourceforge.net/?lang=nl<br />
<br />
Bowles, T., & Pauletto, S. (2010). Emotions in the voice: humanising a robotic voice. York: The University of York.<br />
<br />
Breazeal, C. (2001). Emotive Qualities in Robot Speech. Cambridge, Massachusetts: MIT Media Lab.<br />
<br />
Google Inc. (2014, April 14). Google Forms. Retrieved from Google: http://www.google.com/forms/about/<br />
<br />
Liscombe, J. J. (2007). Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency. Columbia: Columbia University.<br />
<br />
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76(5), 1346-1356.<br />
<br />
Williams, C. E., & Stevens, K. N. (1972). Emotions and Speech: Some Acoustical Correlates. Cambridge, Massachusetts: Massachusetts Institute of Technology.</div>S126005https://cstwiki.wtb.tue.nl/index.php?title=Logboek&diff=15196Logboek2014-10-11T12:21:08Z<p>S126005: </p>
<hr />
<div>Back: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Date<br />
! width="10%" | Time<br />
! width="60%" | Description<br />
! width="25%" | Who?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Introductory lecture explaining the project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Sept<br />
| 1 hour<br />
| Coming up with a topic during the lecture<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 4 hours<br />
| Searching for sources on techniques for writing without using your hands (i.e. with the eyes or brain)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Sept<br />
| 3 hours<br />
| Searching for sources on emotions in the voice (and their application in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on the current state of technology for making robots speak.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Sept<br />
| 4 hours<br />
| Searching for sources on voice cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Sept<br />
| 6 hours<br />
| Sharing sources and discussing what we want and what we are going to do<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 30 min<br />
| Meeting with Raymond Cuijpers to ask questions about possibilities<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Sept<br />
| 4 hours<br />
| Settling on the final idea and preparing Monday's presentation<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 1 hour<br />
| Read an article on emotions and worked on the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Sept<br />
| 15 min<br />
| Called Suzanne for an update about Friday<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 20 min<br />
| Improved the explanation on the slides and the research question<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 75 min<br />
| Searched for sources on characteristic aspects of emotions in speech<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Sept<br />
| 45 min<br />
| Read an article on emotions and improved the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Prepared the presentation<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1.5 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 2 hours<br />
| Lecture with presentations<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Meeting: feedback, general action points, agreeing on fixed meeting days, and making a short plan.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 30 min<br />
| Cleaned up the week 1 wiki page and presented it clearly<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Searched for sources on text-to-speech systems and information about Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Sept<br />
| 1 hour<br />
| Looked up Nao's speech-related functions<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 10 min<br />
| Added to the feedback on Monday 8 September's presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 2 hours<br />
| Searched for sources<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 4 hours<br />
| Searched for, chose, and studied TTS systems, among others in Matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 1.5 hours<br />
| Searched for articles on emotions in speech<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 Sept<br />
| 30 min<br />
| Read the article on emotions that Suzanne found<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 90 min<br />
| Worked in Matlab with pitch and speaking rate.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Scanned yesterday's articles for relevance.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 Sept<br />
| 1 hour<br />
| Processed sources and updated the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Found and read the article Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency and put an explanation on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 6 hours<br />
| Discussing progress; a new angle on the research<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 3 hours<br />
| Put a very detailed planning on the wiki ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 Sept<br />
| 2 hours<br />
| Made the design of the planning for the presentation<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 2 hours 15 min<br />
| Searched for sources on our new angle.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 15 min<br />
| Put the outcome of Thursday 11 September's meeting on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 Sept<br />
| 30 min<br />
| Determined milestones and deliverables and worked on the presentation<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 3 hours<br />
| Made the presentation and put it on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 1 hour<br />
| Started on the overview of emotion characteristics + values<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 Sept<br />
| 2 hours 30 min<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 Sept<br />
| 5 hours<br />
| Finishing the overview of emotion characteristics + values<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 1 hour 30 min<br />
| Going over last week's feedback and this week's planning.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 2 hours<br />
| Presentations<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 4 hours<br />
| Searched for a new TTS program that does not modify the voice and adjusted the Matlab functions<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 Sept<br />
| 15 min<br />
| Put the feedback on Monday 15 September's presentation in week 2.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 5 hours 15 min<br />
| Investigated options for adjusting aspects of the robot voice<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 Sept<br />
| 1 hour 30 min<br />
| Research into the best way to work out the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 3 hours<br />
| Searched for sources on persuasiveness and put them on the wiki<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 Sept<br />
| 4 hours<br />
| Working out the setup for the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 5 hours<br />
| Group meeting in which we came up with the idea for the survey and told each other what we did this week.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 1.5 hours<br />
| Looking at the final sentences and how to approach this<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 Sept<br />
| 2 hours<br />
| Devising and working out the setup of the survey<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 Sept<br />
| 8 hours<br />
| Recording and editing the robot voice. It is now finished: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 3 hours<br />
| Investigating options for online questionnaires.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 5 hours<br />
| Working out the final version of the questionnaires<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 Sept<br />
| 1 hour 30 min<br />
| Discussing the questionnaires via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 2 hours<br />
| Discussing the questionnaire setup / discussing progress<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 Sept<br />
| 1 hour<br />
| Feedback session<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1.5 hours<br />
| Putting the audio fragments on YouTube<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 Sept<br />
| 1 hour<br />
| Performed a power analysis.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 Sept<br />
| 30 min<br />
| Discussing what to do about the number of participants<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Meeting with Raymond and debriefing.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 1 hour 30 min<br />
| Progress meeting.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 Sept<br />
| 45 min<br />
| Updated the wiki.<br />
| Iris <br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 1 hour<br />
| Questionnaire meeting.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 3 hours 45 min<br />
| Putting the questionnaire in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 6 hours<br />
| Made extra sentences. <br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Updated the wiki.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 Sept<br />
| 15 min<br />
| Put the extra sentences on YouTube.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 3 hours<br />
| Drew up the Godspeed questionnaire for the end of the survey and continued working on the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 15 min<br />
| Checking the questionnaire so far and looking for improvements<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sept<br />
| 5 hours<br />
| Made part of the questionnaire in Google Docs<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Giving [[Feedback questionnaire]]<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey on Google Docs. <br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 4 hours<br />
| Did a pilot study with my parents and evaluated it<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 1 hour<br />
| Discussing the questionnaire<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 15 min<br />
| Reading the e-mail from [[Acapelabox]] and processing the information<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sept<br />
| 30 min<br />
| Feedback questionnaire <br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 3 hours<br />
| Discussing progress, giving personal feedback, and improving the questionnaire <br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Coach meeting<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Re-uploading the videos to YouTube because not everything went well last time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Made different versions of the questionnaires. <br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 30 min<br />
| Checking that all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 1 hour 30 min<br />
| Going through the questionnaires once more, adjusting questions, and removing cosmetic errors.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sept<br />
| 45 min<br />
| Going through the questionnaires once more and removing small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 1 hour 30 min<br />
| Inviting people to the questionnaire.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people to the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 2 hours<br />
| Inviting people to the questionnaire.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sept<br />
| 3 hours 30 min<br />
| Looking through the questionnaires and inviting people to the questionnaire<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sending reminders for the questionnaire and looking at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and devised the analysis. <br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour <br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour <br />
| Worked on the design (method)<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min <br />
| Sending people a reminder to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours 30 min<br />
| Writing the 'Participants' section and coding data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours <br />
| Reminded people to fill in the questionnaire and made a setup for the introduction<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 50 min<br />
| Finished the design (method) and updated the setup of study 2.0 (week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the questionnaire.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 30 min<br />
| Coding data<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and placed it on the wiki<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and found the last participant<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Made the first part of the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting, coding data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour 30 min<br />
| Checking the data coding<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 30 min<br />
| Finished the method (materials)<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 15 min<br />
| Checking the coding<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 3 hours 45 min<br />
| Carried out all Tuesday morning tasks. See [[Week 6]]<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 Oct<br />
| 6 hours<br />
| Carried out all Wednesday evening tasks and started on Thursday afternoon's tasks. See [[Week 6]]<br />
| Floor + Meike + Iris + Suzanne <br />
<br />
|- valign="top"<br />
! scope="row" | 9 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion. <br />
| Floor + Meike + Iris + Suzanne <br />
|}<br />
<br />
Total number of hours: 463 hours</div>
<hr />
<div>Terug: [[PRE_Groep2]]<br />
----<br />
<br />
<br />
{| border="1" class="wikitable" cellpadding="2" width="100%"<br />
|- valign="top"<br />
! scope="row" width="5%" | Datum<br />
! width="10%" | Tijd<br />
! width="60%" | Beschrijving<br />
! width="25%" | Wie?<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Inleidend college met uitleg van het project<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 1 sept<br />
| 1 uur<br />
| Onderwerp bedenken in college<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 4 uur<br />
| Bronnen zoeken over technieken waarmee je kunt schrijven zonder je handen (dus met ogen of hersenen)<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 sept<br />
| 3 uur<br />
| Bronnen zoeken over emoties in stemgeluid (en toegepast in robots)<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over in hoeverre op dit moment techniek ontwikkeld is om robots te laten spreken.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 sept<br />
| 4 uur<br />
| Bronnen zoeken over voice-cloning<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 sept<br />
| 6 uur<br />
| Bronnen delen en bespreken wat we willen en wat we gaan doen<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 sept<br />
| 30 min<br />
| Afspraak met Raymond Cuijpers om vragen te stellen over mogelijkheden<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 sept<br />
| 4 uur<br />
| Uiteindelijk idee vastleggen en presentatie voor maandag voorbereiden<br />
| Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 sept<br />
| 1 uur<br />
| Artikel over emoties gelezen en aan presentatie gewerkt<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 6 sept<br />
| 15 min<br />
| Suzanne gebeld voor een update over vrijdag<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 7 sept<br />
| 20 min<br />
| Uitleg van de slides verbeterd en onderzoeksvraag verbeterd<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 7 sept<br />
| 75 min<br />
| Gezocht naar bronnen over kenmerkende aspecten van emoties in spraak<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 sept<br />
| 45 min<br />
| Artikel over emoties gelezen en wiki verbeterd<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 30 min<br />
| Presentatie voorbereid<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 1,5 uur<br />
| Bronnen gezocht<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 2 uur<br />
| College met presentaties<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 1 uur<br />
| Vergadering: feedback, algemene actiepuntjes, vaste dagen voor meetings afspreken en klein plan maken.<br />
| Meike + Suzanne + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 30 min<br />
| Wiki week 1 opgeschoond en duidelijk neergezet<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 1 uur<br />
| Bronnen gezocht over text-to-speech systemen en informatie over Amigo<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 8 sept<br />
| 1 uur<br />
| Functies van Nao op gebied van spraak opgezocht<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 10 min<br />
| Feedback presentatie maandag 8 september aangevuld<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 2 uur<br />
| Bronnen gezocht<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 4 uur<br />
| TTS systemen gezocht, gekozen en verdiept. Oa in matlab<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 1,5 uur<br />
| Artikelen emoties in spraak gezocht<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 9 sept<br />
| 30 min<br />
| Artikel over emoties gelezen die Suzanne heeft gevonden<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 10 sept<br />
| 90 min<br />
| In Matlab gewerkt met pitch en spreeksnelheid.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 10 sept<br />
| 1 uur<br />
| Artikelen die gisteren gevonden zijn gescand op relevantie.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 10 sept<br />
| 1 uur<br />
| Bronnen verwerkt en de wiki bij gewerkt<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 2 uur<br />
| Artikel Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency gevonden, gelezen en uitleg op de wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 6 uur<br />
| Voortgang bespreken, andere invalshoek op het onderzoek<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 3 uur<br />
| Zeer uitgebreide planning op wiki gezet ([[Planning groep 2]])<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 11 sept<br />
| 2 uur<br />
| Design van de planning voor de presentatie gemaakt<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 12 sept<br />
| 2 uur en 15 min<br />
| Bronnen gezocht over onze nieuwe invalshoek.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 12 sept<br />
| 15 min<br />
| Uitkomst bijeenkomst donderdag 11 september op wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 12 sept<br />
| 30 min<br />
| Milestones en deliverables vastgesteld en aan presentatie gewerkt<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 sept<br />
| 3 uur<br />
| Presentatie gemaakt en op wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 sept<br />
| 1 uur<br />
| Begonnen aan overzicht kenmerken emoties + waardes<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 13 sept<br />
| 2 uur en 30 min<br />
| Overzicht kenmerken emoties + waardes afmaken<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 14 sept<br />
| 5 uur<br />
| Overzicht kenmerken emoties + waardes afmaken<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 1 uur 30 min<br />
| Feedback vorige week en de planning doornemen van deze week.<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 2 uur<br />
| Presentaties<br />
| Meike + Iris + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 4 uur<br />
| Nieuw TTS programma gezocht die geen aanpassingen heeft aan de stem en Matlab functies aangepast<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 15 sept<br />
| 15 min<br />
| Feedback presentatie maandag 15 september in week 2 gezet.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 16 sept<br />
| 5 uur en 15 min<br />
| Opties om aspecten van de robotstem aan te passen uitgezocht<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 16 sept<br />
| 1 uur en 30 min.<br />
| Onderzoek naar de beste manier voor het uitwerken van de enquetes<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 17 sept<br />
| 3 uur<br />
| Bronnen gezocht over persuasiveness en op wiki gezet<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 17 sept<br />
| 4 uur<br />
| Uitwerken opzet voor de enquetes<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 18 sept<br />
| 5 uur<br />
| Group meeting waarbij we het idee van de survey hebben bedacht en elkaar hebben verteld wat we deze week hebben gedaan.<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 18 sept<br />
| 1,5 uur<br />
| De definitieve zinnen bekijken hoe we dit aan moeten pakken<br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 18 sept<br />
| 2 uur<br />
| Opzet van survey bedenken en uitwerken<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 19 sept<br />
| 8 uur<br />
| Robotstem opnemen en bewerken. Het is nu klaar: [[File:Robotstemmen.zip]]<br />
| Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 20 sept<br />
| 3 uur<br />
| Mogelijkheden online enquêtes uitzoeken.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 20 sept<br />
| 5 uur<br />
| Uitwerken definitieve versie van de enquetes<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 20 sept<br />
| 1 uur en 30 min.<br />
| Overleg over de enquetes via Skype<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 22 sept<br />
| 2 uur<br />
| Enquête opzet bespreken / voortgang bespreken<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 sept<br />
| 1 uur<br />
| Coach meeting<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 22 sept<br />
| 1 uur<br />
| Feedback sessie<br />
| Suzanne + Meike + Iris + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 sept<br />
| 1,5 uur<br />
| Geluidsfragmenten op YouTube zetten<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 23 sept<br />
| 1 uur<br />
| Power-analyse gedaan.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 24 sept<br />
| 30 min<br />
| Bespreken wat we gaan doen met het aantal participanten<br />
| Floor + Iris + Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 sept<br />
| 1 uur en 30 min<br />
| Bespreking met Raymond en nabespreking.<br />
| Iris + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 25 sept<br />
| 1 uur en 30 min<br />
| Bespreking voortgang.<br />
| Iris + Meike + Floor + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 25 sept<br />
| 45 min<br />
| Wiki bijgewerkt.<br />
| Iris <br />
<br />
|- valign="top"<br />
! scope="row" | 26 sept<br />
| 1 uur<br />
| Bespreking enquete.<br />
| Suzanne + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 sept<br />
| 3 uur en 45 min.<br />
| Enquete in google docs zetten.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 26 sept<br />
| 6 uur<br />
| Extra zinnen gemaakt. <br />
| Floor + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 sept<br />
| 15 min<br />
| Wiki bijgewerkt.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 26 sept<br />
| 15 min<br />
| Extra zinnen op youtube gezet.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 sept<br />
| 3 uur<br />
| Godspeed vragenlijst voor het einde van de enquete opgesteld en verder gewerkt aan enquete in Google docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sep<br />
| 15 min<br />
| Checked the questionnaire so far and looked for improvements.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 27 Sep<br />
| 5 hours<br />
| Created part of the questionnaire in Google Docs.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sep<br />
| 30 min<br />
| Gave the [[Feedback questionnaire]].<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sep<br />
| 3 hours<br />
| Incorporated the Godspeed questionnaire into the survey in Google Docs.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sep<br />
| 4 hours<br />
| Conducted a pilot study with my parents and evaluated it.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sep<br />
| 1 hour<br />
| Consultation about the survey.<br />
| Meike + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sep<br />
| 15 min<br />
| Reviewed the email from [[Acapelabox]] and processed the information.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 28 Sep<br />
| 30 min<br />
| Feedback on the questionnaire.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sep<br />
| 3 hours<br />
| Discussed progress, gave personal feedback, and improved the questionnaire.<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sep<br />
| 30 min<br />
| Coach meeting.<br />
| Iris + Meike + Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sep<br />
| 1 hour 30 min<br />
| Re-uploaded the videos to YouTube because not everything had gone well the previous time.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sep<br />
| 1 hour 30 min<br />
| Created different versions of the questionnaires.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sep<br />
| 30 min<br />
| Checked that all videos are correct in the 4 different questionnaires.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sep<br />
| 1 hour 30 min<br />
| Went through the surveys once more, adjusted questions, and fixed cosmetic errors.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 29 Sep<br />
| 45 min<br />
| Went through the surveys once more and removed small errors.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sep<br />
| 1 hour 30 min<br />
| Invited people to take the survey.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sep<br />
| 2 hours<br />
| Invited people to take the survey.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sep<br />
| 2 hours<br />
| Invited people to take the survey.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 30 Sep<br />
| 3 hours 30 min<br />
| Reviewed the surveys and invited people to take the survey.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 1 Oct<br />
| 30 min<br />
| Sent reminders for the survey and looked at the collected data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 5 hours<br />
| Looked at the data and worked out the analysis.<br />
| Meike + Floor + Suzanne + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 2 Oct<br />
| 1 hour<br />
| Wrote the procedure.<br />
| Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 1 hour<br />
| Worked on the design (method).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 30 min<br />
| Sent a reminder to people to fill in the questionnaire.<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 6 hours 30 min<br />
| Wrote the 'Participants' section and coded data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 3 Oct<br />
| 2 hours<br />
| Reminded people to fill in the survey and drafted an outline for the introduction.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 50 min<br />
| Finished the design (method) and updated the research plan 2.0 (week 4).<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 2 hours 15 min<br />
| Continued working on the introduction and tried to approach new people to fill in the survey.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 1 hour 30 min<br />
| Coded data.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 3 hours<br />
| Finished the first version of the introduction and placed it on the wiki.<br />
| Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 4 Oct<br />
| 45 min<br />
| Read through the method and searched for the last participant.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 3 hours<br />
| Wrote the first part of the method (materials).<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 6 hours<br />
| Coach meeting; coded data.<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 5 Oct<br />
| 1 hour 30 min<br />
| Checked the data coding.<br />
| Floor + Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 30 min<br />
| Finished the method (materials).<br />
| Floor<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 15 min<br />
| Adjusted the method (design) and added a source to the introduction.<br />
| Meike<br />
<br />
|- valign="top"<br />
! scope="row" | 6 Oct<br />
| 1 hour 15 min<br />
| Checked the coding.<br />
| Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 3 hours 45 min<br />
| Completed all tasks for Tuesday morning. See [[Week6]].<br />
| Floor + Meike + Iris<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 6 hours<br />
| Completed all tasks for Wednesday evening and started on the tasks for Thursday afternoon. See [[Week6]].<br />
| Floor + Meike + Iris + Suzanne<br />
<br />
|- valign="top"<br />
! scope="row" | 7 Oct<br />
| 6 hours<br />
| Wrote the results and started on the discussion.<br />
| Floor + Meike + Iris + Suzanne<br />
|}<br />
<br />
Total number of hours: 463 hours</div>