FINAL STATUS REPORT

SUBMITTED BY
Deborah Kasner, Jackie Christenson, Robyn Schwartz, Elayna Zack
May 7, 2013

TABLE OF CONTENTS
PROJECT OVERVIEW
OVERALL DESIGN
TESTING/PROTOTYPING RESULTS
PROPOSED IMPROVEMENTS/LESSONS LEARNED
REQUIREMENTS COMPLIANCE
COST

PROJECT OVERVIEW

We have designed, tested, and built a mechatronic system incorporating a roughly anthropomorphic "concierge" character that can interact intelligently with passers-by. The concierge responds to visual and audible cues in its environment, understands a constrained subset of spoken English, and produces synthetic gestures and speech appropriate to the situation. It also provides useful information (such as directions, time, and event schedules) to the user on request. Since the concierge includes features intended to enhance its entertainment value, it has a cartoonish or "retro" appearance rather than naturalistic facial expressions. The system is designed to be installed at a specific location in the Penn engineering complex (Levine Lobby).

The customers gave us the following qualitative requirements for the system:

- Must recognize and acknowledge interlocutors on the basis of a visual image
- Must hear and interpret audible speech commands within an appropriate context, including but not limited to (1) requests for directions, (2) requests for simple information such as time or date, and (3) requests for information about the University and the Department
- Must generate responses to questions via synthetic speech
- Must include a roughly anthropomorphic, animated torso
- Must employ appropriate "body language" as part of the human-machine interaction
- Should employ some form(s) of facial expression, not necessarily anthropomorphic, as part of the interaction
- Must deal appropriately with problem situations (such as crowds, noise, unintelligible speech, etc.), either by "giving up" gracefully or switching to alternative I/O methods
- Should incorporate entertainment value in all or most aspects (motion, speech, gesture, expression, etc.)

To the customers, the display of the machine and its ease of use are key. They want it to be very clear to whom the concierge is speaking. They also came up with a list of extra features that were desired but not required, including a backup input method in case of audio failure (such as a keyboard) and a display screen to play videos and presentations. Both are included in the final system.

After a significant period of development, we defined the following quantitative metrics for the system. These requirements were written considering what is expected of normal conversation as well as the results of testing with a variety of users.

- Must detect a single interlocutor within a two-foot radius
- Must form an appropriate response to a well-posed request in under 15 seconds
- Must sustain a conversation longer than one minute
- Must correctly understand a well-posed request with a 75% probability

OVERALL DESIGN

The design is best understood when broken down into two categories: its mechanical design and its coding/software attributes.

Mechanical Design

After many design iterations, we arrived at the following final design for the robot's outer appearance and mechatronic components:

The Head

The head of the robot was a box of laser-cut MDF pieces press-fit together, measuring approximately 6 in x 8.5 in x 5.5 in. The entire head was spray-painted silver, with eyes, nose, and mouth painted on the faceplate of the box. Within the eyes, 14 circular cutouts allowed us to place blue LEDs and control them via a phidget contained inside the head. Through a series of MATLAB functions we were able to use these LEDs to simulate different eye movements, such as winking or looking left or right, when appropriate within the conversation. Additionally, the head was seated on the 3 in PVC neck and was able to turn left and right through the use of a continuous rotation servo that was fixed in the neck but free to rotate within the head. Lastly, the head also contained the speakers through which the robot's responses were projected.

The Torso

The torso of the robot was also constructed of press-fit MDF pieces spray-painted silver. Its upper third was trapezoidal for aesthetic reasons, while the lower part was rectangular, with a large cutout so that the computer screen would be visible to users.

The Arms

The robot's arms emerged from the trapezoidal section of the torso and were constructed of 2 in PVC pipes spray-painted silver and fitted with end-cap hands. An MDF plate with a laser-cut press fit for the head of the servo was attached to one end of each arm. With this end connected to the servo and the body of the servo held steady by supports within the torso, both arms were able to simulate a waving motion when instructed to do so by the MATLAB code.

The Frame

The frame and outer housing of the robot were constructed in the machine shop from caster wheels, aluminum extrusions, plywood, birch wood, and acrylic sheets.

Coding

The programming portion of the project includes both original code and code that was adapted for our purposes. Tasks adapted from open-source code that we found on the MathWorks website included motion detection, speech recognition, and text to speech. While these functions were based on open-source code, they were heavily adapted in order to work with our original code. The layout of how all of our functions work together can be seen in the diagram shown to the right. We had four main areas of original code: loading the libraries, analyzing the speech input, searching the libraries, and the robot's reactions.

Loading Functions

Each loading function loaded data from Excel spreadsheets, parsed the data, and assigned it to the appropriate field of the corresponding dictionary. The dictionaries were complex data structures made up of entries in an arraylist, with each entry having multiple data fields. There were dictionaries for locations and free speech words.

Analyze

The analyze function determined which conversation mode to be in, which dictionary to look in, and which search and compare functions were needed for the conversation. The function also handled how the computer needed to respond given the result of the search and compare functions. The conversation mode options were free form and twenty questions.

Search and Compare

We had a wide variety of search and compare functions for the various dictionaries. Each search function took in the word that was said by the user and compared it to relevant keywords in the dictionary. When there was a match, a flag indicated that a match was found and that the word existed in the dictionary. There was a search function for the free form word dictionary, as well as a probability search and compare, which considered the percent similarity of the words. There was also a separate compare function for the answers of the twenty questions and a search function for the location dictionary.

Reactions

To ensure that the reactions occurred alongside the speech, we programmed the reactions so that they were tied to the answers. When typing up anything we wanted the computer to say, we added special symbols such as ~, #, or &. The reaction function parsed out these segments, sent the text to the Talk function, and made the appropriate reactions based on the symbol code. These symbols determined the movement of the arms as well as the eye movement. The head had three positions it could move to (left, right, and center), and it rotated through these every 10 seconds.

TESTING AND PROTOTYPING RESULTS

MATLAB Program Testing Results

Once all the main functionality was implemented, we set up user testing, which focused on how the program would interact with the average user. We recruited a variety of people, each with a different level of experience with the engineering buildings, different expectations of a robot, and a different level of interest in the device. Each user had a conversation with the program and was free to explore the conversation options on their own. Afterwards, we asked the user to rate a series of factors on a scale of one to ten.
These factors were: appropriate responses to inquiry, responsiveness to inquiry, entertainment value, ease of use, and helpfulness. We also recorded the conversation options and topics that the user took advantage of and asked them to let us know what they would have wanted the conversation to include.

[All scores out of 10]

Metric                             Average Score   Minimum Score   Maximum Score
Appropriate Responses              7.75            7               9
Responsiveness                     5               3               7
Helpfulness                        8               8               8
Entertaining                       6.5             5               8
Ease of Use - Smalltalk Portion    4               4               4
Ease of Use - Questions Portion    7.25            7               8

The most common conversation topics were directions to places within the engineering complex and information about the complex itself. The most frequently suggested topics to add were time and date, history of the engineering school, and directions to other places on campus.

From this testing we concluded that the program is helpful and that, when the concierge recognizes a word, it is able to formulate an appropriate response. It has some issues with repeating itself before understanding what the user wants, but it is reasonably entertaining and easy to get directions from. Areas for improvement are the smalltalk portion of the conversation and the concierge's responsiveness to a variety of inputs. Additionally, we noted that this test conversation was with a computer only (which lacks the moving parts and facial expressions of the final concierge), which may significantly affect the entertainment value.

After going through our results, it is clear that we have fulfilled our metrics. Overall, the concierge responds appropriately to users' questions, and while it sometimes asks a question multiple times, the user is usually able to get to the section of the conversation he or she is looking for. Users found the concierge helpful and generally easy to use after a small adjustment period. The entertainment value varied by user, as it depends on the user's interest in technology. Most people only experienced the portion of the conversation in which the program asks the user questions. The people who did experience the free-form conversation, in which the concierge identifies keywords and makes smalltalk, found that the vocabulary was limited, which slightly decreased the realism of their discussion with it. However, they still found the discussion to be adequately informative and entertaining.
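The keyword recognition that these results depend on, in particular the probability search and compare that scores the percent similarity between a heard word and the dictionary keywords, can be illustrated with a short sketch. The project's code was written in MATLAB and is not reproduced in this report, so the Python below is only an illustration under stated assumptions. The similarity measure (`difflib.SequenceMatcher` from the Python standard library), the 0.75 threshold, the function name `best_keyword`, and the sample keywords are all hypothetical and are not taken from the project.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score between two words, ignoring case.

    SequenceMatcher is a stand-in for the report's "percent similarity";
    the measure actually used in the project is not documented here.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_keyword(heard, keywords, threshold=0.75):
    """Find the dictionary keyword most similar to the heard word.

    Returns (keyword, score) when the best score reaches the threshold,
    otherwise (None, score), which plays the role of the "no match" flag.
    The threshold value is an assumption made for this sketch.
    """
    best, best_score = None, 0.0
    for keyword in keywords:
        score = similarity(heard, keyword)
        if score > best_score:
            best, best_score = keyword, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Hypothetical location keywords; a slightly misrecognized word still matches.
match, score = best_keyword("libary", ["library", "cafeteria", "restroom"])
```

A similarity threshold of this kind lets a slightly misheard word still reach the right dictionary entry, while a word that resembles nothing in the dictionary is reported as unmatched. That fits the observation above that a limited vocabulary restricted the free-form conversation.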
Based on this user feedback, we added many terms to the concierge's vocabulary and conducted a second round of user testing. The feedback from this second round confirmed that our additions did increase the realism of conversations between the user and the concierge, which was our main goal.

Mechanical Design Prototyping Results

In line with the requirement that our robot appear approachable, friendly, and entertaining, we held an aesthetic focus group to get feedback on the different designs for the face. We laser cut different sizes and styles of both heads and faces and then asked various students and professors how approachable and realistically sized each design was. In general, we found that students and professors alike responded best to a smiling face and stated that they would be more willing to approach a machine with a life-sized head and a friendly expression. In response, we changed the head dimensions (greater width), chose a very friendly expression, and used a trapezoid for the torso instead of a cube in order to create shoulders.

We also conducted tests on the concierge's internal mechanical design, which included a system of 3 servomotors and a phidget with 7 LED inputs. Our first round of testing showed that the servos could not provide the torque necessary to rotate the arms through the full range of motion we desired. It also showed that our original design would not move the head in the manner we had envisioned. Following these findings, we redesigned the internal mechanisms of both the head and the torso. Our new head design placed the servomotor inside the neck instead of its original location inside the head. This enabled the correct head motion and left more than enough room for the phidget and the speakers within the head. Our new torso/arm design gave the arms a smaller size and a smaller range of motion, which the servos had enough torque to drive. These new designs changed the external look of the concierge only very slightly, keeping the external features very nearly the same as the design that users had responded positively to during the first round of testing, so we did not have to conduct a second round of user testing.

PROPOSED IMPROVEMENTS/LESSONS LEARNED

Although our project meets all of the metrics we outlined, there are areas that we feel could have been improved had we had more time and resources. One of these areas is the concierge's speech recognition ability.
Although the Microsoft API speech software allowed us to accomplish our goals, spending more money on a better software program would have improved the user experience. Had we had a larger budget, we would have done so. Had we had more time, we would also have liked to add pictures and/or videos to go along with the concierge's directions. This would greatly increase the entertainment value for users as well as the ease of use, as it would help them to memorize the directions.

In terms of mechanical design, had we had more resources we would have made a more exact and polished frame for the concierge to sit within. Since the material for this area was very costly, we only had enough money in our budget to order the exact amount we needed. As there ended up being a few slight errors in our measurements, a few of the panels did not fit together as exactly as they could have, which made for a less polished overall appearance. We also did not have a large enough budget to order enough acrylic to completely enclose the concierge, which forced us to eliminate the back acrylic panel from our design. Though this did not decrease its aesthetic quality too much, we would have fixed this had we had the funds to do so. With a larger budget we also might have purchased servomotors that could provide the torque required to move the arms through the originally planned range of motion. These servos are larger and thus more expensive than the ones we ended up using, but the torque they provide could have given the arms more realistic movement.

The final thing we would have liked to do better was the time we gave ourselves for integration. A few extra weeks of integration would have been helpful. We know that any amount of time we gave ourselves to integrate would have been used; however, there were a few issues that could have been worked out if we had allowed extra time.

REQUIREMENTS COMPLIANCE

Following is a list of the customers' requirements and an identification of how each requirement was met.

- Must recognize and acknowledge interlocutors on the basis of a visual image
  o Through the use of a webcam, the concierge identifies when there are people within its immediate environment, and when enough movement is made within a few feet of it (when a specific user approaches) it acknowledges that user and begins a conversation.

- Must hear and interpret audible speech commands within an appropriate context, including but not limited to (1) requests for directions, (2) requests for simple information such as time or date, and (3) requests for information about the University and the Department
  o The speech recognition, analyze function, and search and compare functions worked together to recognize and interpret commands given by the user. See the coding description for more information.
- Must generate responses to questions via synthetic speech
  o We adapted an open-source text-to-speech function, which we used to generate synthetic speech.
- Must include a roughly anthropomorphic, animated torso
  o The torso was created from MDF and was painted silver to look robot-like. A trapezoidal shape was used to give the appearance of shoulders. Slots were cut within these shoulder plates to allow for arms, which move in coordination with the concierge's conversation.
- Must employ appropriate "body language" as part of the human-machine interaction
  o The head and arms move appropriately according to the concierge's conversation with a user. For instance, when giving directions, if it tells a user to turn left it moves the corresponding arm. It also waves both arms in greeting or when saying goodbye to users.
- Should employ some form(s) of facial expression, not necessarily anthropomorphic, as part of the interaction
  o The eyes move appropriately according to the concierge's conversation with a user. For instance, when giving directions, if it tells a user to turn left it looks to the (user's) left. It also blinks its eyes regularly throughout its conversations to appear realistic.
- Must deal appropriately with problem situations (such as crowds, noise, unintelligible speech, etc.), either by "giving up" gracefully or switching to alternative I/O methods
  o We implemented a back-up system in which the computer automatically switches to typed input after a certain number of misunderstood input commands.
- Should incorporate entertainment value in all or most aspects (motion, speech, gesture, expression, etc.)
  o The motion of the head and arms served this entertainment function, in addition to the many different eye movements produced with the LED lights. Beyond physical motions and gestures, several humorous responses were programmed into the code.

COST

Purchases:

Item                           Quantity   Total Cost ($)
Birch Wood                     1          8.37
Plywood                        1          30.89
2 in PVC                       4          17.68
PVC Endcaps                    2          2.52
3 in PVC                       1          6.97
Wheels                         4          50.00
Servo Motor                    7          200.00
Video Camera & Mic             1          20.00
Acrylic Plates                 4          300.00
LED Lights                     30         30.00
Bookstore Supplies             10         50.00
Computer Connection Supplies   2          70.00
Speakers                       4          100.00
Extension Cables               3          25.00
Paint Supplies                 6          70.00
Poster Printing                1          120.00
Screws                         24         40.00

Allotted Team Funds: $300 x 4 = $1,200
Total Spending: $1,141.43
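As a cross-check of the purchases table, the short sketch below sums the listed line items and compares the total against the allotted funds. It is written in Python purely for illustration (the project itself used MATLAB). The names `PURCHASES`, `total_spending`, and `budget_summary` are hypothetical, and the inch sizes in the PVC item labels are an assumed reading of the table. The costs and the budget figures are copied from the table.

```python
from decimal import Decimal

# (item, quantity, total cost in dollars), copied from the purchases table.
# Costs are kept as strings so Decimal can represent the cents exactly.
PURCHASES = [
    ("Birch Wood", 1, "8.37"),
    ("Plywood", 1, "30.89"),
    ("2 in PVC", 4, "17.68"),
    ("PVC Endcaps", 2, "2.52"),
    ("3 in PVC", 1, "6.97"),
    ("Wheels", 4, "50.00"),
    ("Servo Motor", 7, "200.00"),
    ("Video Camera & Mic", 1, "20.00"),
    ("Acrylic Plates", 4, "300.00"),
    ("LED Lights", 30, "30.00"),
    ("Bookstore Supplies", 10, "50.00"),
    ("Computer Connection Supplies", 2, "70.00"),
    ("Speakers", 4, "100.00"),
    ("Extension Cables", 3, "25.00"),
    ("Paint Supplies", 6, "70.00"),
    ("Poster Printing", 1, "120.00"),
    ("Screws", 24, "40.00"),
]

ALLOTTED_PER_MEMBER = Decimal("300")  # from "Allotted Team Funds: $300 x 4"
TEAM_SIZE = 4


def total_spending(purchases):
    """Sum the Total Cost column (each cost already covers the full quantity)."""
    return sum((Decimal(cost) for _, _, cost in purchases), Decimal("0"))


def budget_summary(purchases):
    """Return the team budget, the amount spent, and the amount remaining."""
    budget = ALLOTTED_PER_MEMBER * TEAM_SIZE
    spent = total_spending(purchases)
    return {"budget": budget, "spent": spent, "remaining": budget - spent}


summary = budget_summary(PURCHASES)
```

The line items sum to the reported total spending of $1,141.43, which leaves $58.57 of the $1,200 allotment unspent.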

Though we spent most of our budget, we were still able to stay within our allotted funds. This was a considerable accomplishment for our team, given that the large size of our project made even simple construction materials very expensive to purchase. Ultimately, we were able to keep our spending within our $1,200 budget because we received many donations and items on loan. Key costly items that we had access to in this manner include the phidget, aluminum extrusions, computer, monitor, keyboard, MATLAB, and Microsoft API software.