Amazon Alexa + Raspberry PI + jasm.eu: Our first steps together

These were my requirements when I started this small project:

  • I want Alexa to understand any kind of command I find appropriate for my projects (i.e. beyond things like “Alexa, turn on the lights” or “Alexa, turn off the heating”). Yes, eventually I may even want to control my Photo Wall with Alexa.
  • It should be fast to implement. Right now I’m happy with just a prototype to show “it works”.
  • Even if it’s a prototype, I don’t want it to be completely insecure. My Raspberry PI is and remains behind a firewall.

Notes before the actual project:

  • Speaking of security: you must understand this is only a prototype. Strong security was not on my list of requirements, so it is NOT secure. If you choose to follow this document, you do so at your own risk. If you plan to extend this prototype into a “real” project, consider improving the security.
  • Amazon accepts only HTTPS connections to a 3rd-party system, and only on port 443. I didn’t consider forwarding port 443 in the router to my PI, as I consider this too insecure. For the chosen alternative, see below.

 

0. The Ingredients

  • Echo Dot (I guess the other Echos would work, too) which is already configured, up and running
  • Raspberry PI running in a private network, behind the firewall
  • Some knowledge of Python and the Unix shell

1. Big Pic

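[Figure alexa1: overview diagram showing steps 1 to 4 of the request and the reverse path of the response]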

2. Python CGIHTTPServer

  • Create a folder on the PI (e.g. alexaServer). Create a subfolder called cgi-bin inside it.
  • Run python -m CGIHTTPServer 8000 in a Terminal while in the folder “alexaServer”. This starts a simple WebServer on port 8000. (CGIHTTPServer is a Python 2 module; on Python 3 the equivalent is python3 -m http.server --cgi 8000.) If the command fails, search the internet for which packages you need to run Python on the PI and how to install them.
  • If it worked, stop the server (Ctrl+C); you’ll start it again a bit later.

3. Bash cgi-bin

  • Create a file on your PI containing the following content:
#!/bin/bash
# CGI response headers, followed by one empty line
echo "Cache-Control: no-cache, no-store, must-revalidate"
echo "Pragma: no-cache"
echo "Expires: 0"
echo "Content-Type: application/json;charset=UTF-8"
echo ""
# Response body: the JSON payload Alexa reads out (keep it on one line)
echo "{ \"version\": \"1.0\", \"response\": { \"outputSpeech\": { \"type\": \"PlainText\", \"text\": \"Got it. Yes, your command was received.\"} }}"

(Feel free to use your own text instead of mine. Make sure the JSON payload above is written on a single line, without the line break my blog software inserts in the block above.)

  • For details on the payload, check Amazon’s documentation. The payload above just makes Alexa “read” the message you specify.
  • Save the file in the cgi-bin folder created above. Call it, for example, alexa.sh
  • Make the file executable: chmod u+x alexa.sh
  • Start the Python WebServer again, as described above.
  • To verify that your script works and outputs the JSON payload you need, use some tool to send a POST to your PI WebServer: http://<pi_address or name>:8000/cgi-bin/alexa.sh A suitable tool would be, for example, Postman for Chrome.
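Escaping every quote by hand in bash gets tedious once the payload grows. As a rough sketch only, the same response could be produced by a Python CGI script. This assumes Python 3 is installed on the PI; the file name alexa.py and the helper build_response are hypothetical names, not part of the setup described here.

```python
#!/usr/bin/env python3
# Sketch of a Python CGI script (hypothetical file cgi-bin/alexa.py) that
# returns the same payload as alexa.sh without hand-escaped quotes.
import json
import sys


def build_response(text):
    """Return the Alexa response payload as a one-line JSON string."""
    payload = {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text}
        },
    }
    return json.dumps(payload)


if __name__ == "__main__":
    headers = [
        "Cache-Control: no-cache, no-store, must-revalidate",
        "Pragma: no-cache",
        "Expires: 0",
        "Content-Type: application/json;charset=UTF-8",
    ]
    # CGI output: headers, one empty line, then the body.
    body = build_response("Got it. Yes, your command was received.")
    sys.stdout.write("\n".join(headers) + "\n\n" + body + "\n")
```

Like alexa.sh, such a file would have to live in cgi-bin and be made executable with chmod u+x before the WebServer can run it.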

4. ngrok client

  • Download the “Linux ARM” version of the ngrok client.
  • Unzip it.
  • Start it by calling ./ngrok http 8000 in a Terminal.
  • You will see an https URL in the format https://<some cryptic name>.ngrok.io Write it down; you’ll need it. Make sure to take the https one!
  • Test your cgi-bin script in Postman again, this time over the external URL: https://<some cryptic name>.ngrok.io/cgi-bin/alexa.sh
  • Again a note here: as the test above shows, your PI is now reachable from the Internet. Anyone who knows the <some cryptic name> can connect to your WebServer. Keep the ngrok client running only as long as you really need it for your prototype. You can always stop it with Ctrl+C and start it when you need it again. Btw, after every start you’ll get a new <some cryptic name>, so you’ll also need to adjust your Alexa Skill (see details below). As I said in the beginning, this is intended to be just a prototype. For a real solution, this is exactly the part to be solved.
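If you prefer a script over Postman for the two POST tests above, something along these lines might do. It is only a sketch using Python 3’s standard urllib; the function names build_post and call_endpoint are made up for illustration, and the URL you pass in is whichever of the two URLs above you want to test.

```python
import json
import urllib.request


def build_post(url, payload):
    """Build a POST request carrying `payload` as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_endpoint(url, payload):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(build_post(url, payload), timeout=10) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

Calling call_endpoint with the alexa.sh URL and an empty dictionary as payload should return the payload from the script, since alexa.sh ignores the request body anyway.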

5. Alexa Skill

  • Still following? Congratulations! You are almost done! 🙂
  • Visit developer.amazon.com and sign in or register.
  • Go to the Alexa area and Get Started with “Alexa Skills Kit”.
  • “Add a New Skill”
  • Choose “Custom Interaction Model”, choose your language and enter a name for your skill. For Invocation Name, enter the name by which Alexa should recognize that she must trigger your new skill. For example, you may enter Kermit. Push Next.
  • For the Intent Schema we’ll use something as simple as possible, yet not so simple that we can’t see how powerful Alexa is:
{
  "intents": [{
    "intent": "PIPing",
    "slots": [{
      "name": "info",
      "type": "AMAZON.DATE"
    }]
  }]
}
  • For Sample Utterances, we’ll use:
PIPing {info}

Then push Next.

  • In Configuration, select HTTPS and the relevant geographical region, and enter the URL you wrote down above: https://<some cryptic name>.ngrok.io/cgi-bin/alexa.sh
  • Keep Account Linking on No and push “Next”.
  • For “SSL Certificate”, choose the option “My development endpoint is a sub-domain of a domain that has a wildcard certificate from a certificate authority”. Next.
  • In the Test area -> Text enter your command for Alexa in the Enter Utterance field. For example, you may enter:
Alexa, tell kermit today
  • Yes, I admit this is not a very intelligent skill. It’s just a prototype to illustrate the roundtrip from Echo to PI and back.
  • Push the “Ask …” button. You’ll see the request JSON being created, with the field “info” containing today’s date. This shows that the skill understood the term “today” and replaced it with a real date.
  • If everything works fine, you’ll see the JSON produced by the cgi-bin in the Service Response.
  • Now you are ready for the real test: speak to your Echo device and say the text above: “Alexa, tell kermit today”. Alexa should respond as specified in the cgi-bin.
  • Troubleshooting: should something go wrong with your test, you’re in some trouble, because there is no real info on what went wrong. At most, you can look for details in the Alexa App on your mobile device, which may show an error message. For example, I once saw “A connection could not be established to Resource [https://myip:myport] Type [HTTP] ” when I tried to have the Skill connect to a port != 443.
  • You may stop at this step. As you can see, it works for you without going through the publishing steps. This certainly is not a skill you want to publish, right?
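The cgi-bin above ignores the request it receives. To actually act on a command, a script would have to pull the “info” slot out of the request JSON. A minimal sketch of that step in Python 3 follows; the helper name slot_value is hypothetical, and the sample request is made up and heavily trimmed (a real request carries more fields, and the date shown is purely illustrative).

```python
import json


def slot_value(request_body, slot_name):
    """Return the value of `slot_name` from an Alexa request, or None."""
    request = json.loads(request_body).get("request", {})
    slots = request.get("intent", {}).get("slots") or {}
    return slots.get(slot_name, {}).get("value")


# Made-up, heavily trimmed sample request; the date is illustrative only.
sample = json.dumps({
    "version": "1.0",
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "PIPing",
            "slots": {"info": {"name": "info", "value": "2017-01-01"}},
        },
    },
})
print(slot_value(sample, "info"))  # prints 2017-01-01
```

Inside a CGI script, the request body arrives on standard input, and its length is given by the CONTENT_LENGTH environment variable.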

6. Summary

You’re done. The big pic above, with steps 1 to 4 for the request and the reverse for the response, probably makes full sense now.

For questions, comments, feedback, feel free to use the Raspberry PI Forum.

 

 
