Apache Spark in Microsoft Fabric

If you have used Spark in Azure Synapse, prepare to be pleasantly surprised by the compute experience in Microsoft Fabric: Spark compute starts a lot faster because the underlying technology has changed. The Data Engineering and Data Science Fabric experiences include a managed Spark compute which, like previous Spark compute, charges you when it is in use. The difference is that the nodes are reserved for you rather than allocated when you start the compute, so compute starts in 30 seconds or less versus the 4 minutes of waiting it takes for Azure Synapse compute to start. If you have capacity needs that the default managed Spark compute will not meet, you can always create a custom pool. Custom pools are created in a specific workspace, so you will need Administrator permissions on the workspace to create them. You can also choose to make the new pool your default pool, so it will be what starts in the workspace.

Writing Spark Code in Fabric

If you are writing code in Spark, the two languages you will most likely be using are Python or Scala, but you could also choose Java, R, or ANSI SQL. Notice that unlike with Azure Synapse, .NET is not included as a language you can use, which is an interesting development. The other thing to keep in mind when writing SQL code in Spark is that you will be writing ANSI SQL, not T-SQL, which you use in Lakehouses and SQL endpoints within Fabric. While T-SQL is ANSI compliant, I realized the extent of the differences when trying to use some of the T-SQL date functions such as DATEPART, as their Spark counterparts have underscores in their names (date_part, for example), and you use instr instead of T-SQL's CHARINDEX. The differences are minor, and Stack Overflow or Copilot can help you with them. Just remember that you may not be able to use the exact same code as in the SQL endpoint and you will be fine.
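To make the difference concrete, here is a minimal sketch pairing the two T-SQL functions mentioned above with their Spark SQL counterparts. The dictionary name, the helper function, and the sample literals are my own illustration, and the helper assumes it is handed the `spark` session that a Fabric notebook provides. Note that instr takes the string first and the substring second, the reverse of CHARINDEX.

```python
# Illustrative pairs only: T-SQL expression -> Spark SQL equivalent.
# The names and literals here are assumptions made for this example.
TSQL_TO_SPARK_SQL = {
    # CHARINDEX(substring, string) becomes instr(string, substring)
    "CHARINDEX('Fabric', 'Microsoft Fabric')": "instr('Microsoft Fabric', 'Fabric')",
    # DATEPART(part, date) becomes date_part('PART', date)
    "DATEPART(year, '2024-03-15')": "date_part('YEAR', DATE '2024-03-15')",
}


def show_equivalents(spark):
    """Run each Spark SQL expression on the given SparkSession and show the result."""
    for tsql, spark_sql in TSQL_TO_SPARK_SQL.items():
        print(f"T-SQL: {tsql}  ->  Spark SQL: {spark_sql}")
        spark.sql(f"SELECT {spark_sql} AS result").show()
```

In a notebook cell you would call `show_equivalents(spark)`; nothing runs against Spark until you do.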

Lakehouse Integration and Autogenerated Code

As in all Fabric experiences, the lakehouse is an integral part of the coding experience. When you create a new notebook, the first step is to add a lakehouse. Once it has been added, you can drag and drop elements into the notebook and Fabric will write the code for you. The code block shown below was literally created when I dragged the table publicholidays into the notebook.

Autogenerated Spark dataframe using the clicky-draggy method
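For readers who cannot see the screenshot, here is a rough sketch of the kind of code a drag-and-drop like this produces: a Spark SQL query over the table, loaded into a dataframe. This is not the exact generated cell. The lakehouse name `MyLakehouse`, the row limit, and both helper functions are assumptions made for illustration; only the table name publicholidays comes from the example above.

```python
def build_preview_query(lakehouse, table, limit=1000):
    """Build a simple preview query for a lakehouse table (hypothetical helper)."""
    return f"SELECT * FROM {lakehouse}.{table} LIMIT {limit}"


def load_preview(spark, lakehouse, table, limit=1000):
    """Return a Spark dataframe for the preview query, given a SparkSession."""
    return spark.sql(build_preview_query(lakehouse, table, limit))


# "MyLakehouse" is a placeholder; use the lakehouse attached to your notebook.
print(build_preview_query("MyLakehouse", "publicholidays"))
```

In a Fabric notebook you would then run something like `df = load_preview(spark, "MyLakehouse", "publicholidays")` and display the dataframe.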

Generating code with Copilot in Spark

Spark in Fabric includes a library called chat-magics, and this library includes AI features which you can incorporate in your code if you have Copilot enabled in your tenant. There are a few administrative steps you need to complete to make that work. The first step is to see if Copilot is supported in your Fabric tenant, as it is not available everywhere. Check the list to make sure it is possible. You will also need to pay for the feature, as Copilot is not available as part of the free trial and you will need a Fabric F64 SKU or a P1 capacity to use it. Once you have validated that you can use Copilot, go to the Administrative settings and enable Copilot in your tenant, as shown below.

Fabric Copilot Admin settings

Once Copilot is enabled and active, you will be able to open it by clicking on the Copilot icon on the far right of the screen. If you don’t see it, click on the ellipsis, the three-dot menu where Microsoft hides all the cool stuff, and you will see the icon in a dropdown menu.

Chat-magics: Copilot spark help

Here are five chat-magics commands designed to help you with your code.

%%chat – Designed to provide answers for you regarding items in your code such as variables
%%describe – Provides a summary of the contents of a dataframe
%%code – Explain what code you want written and Copilot will generate it for you
%%add_comments – Most people forget to comment their code, and if this is you, you can have AI generate meaningful comments for you.
%%fix_errors – Using this command, Copilot will try to fix dependency, configuration and resource allocation errors for you.

In my next post I will provide examples of how to use chat magic commands in Fabric.

Yours Always,

Ginger Grant

Data aficionado et Data Raconteur

 

Incorporating Cognitive Services

There has been a lot of very advanced research on developing algorithms which can analyze facial expressions, authenticate voices, and understand language. Microsoft has decided to make this research available by creating a series of products which allow people to incorporate it into their applications. The cognitive service that I investigated first was the Language Understanding Intelligent Service (LUIS).

Teaching the Computer to Understand Text with Cognitive Services

There is a very good example of how to make LUIS understand text here. In the sample, you can click on a button containing text or enter text free form. What LUIS does with the text is shown in the grey box on the right: the JSON that is returned displays the score LUIS gave to the intent “TurnOn”. LUIS does not turn on lights for you, but there is a really good example of some code where people are using LUIS to control their home automation.

Before you can implement a solution with LUIS you need to define the intents which are listed in the JSON script.  An intent is an action you have defined. Some example intents might be to Find a Hotel in Seattle or Tell me Amazon’s Stock Price or a lot of the other things people have Alexa do for them. The scope of what you would have LUIS do for you is a lot more focused, as the number of Intents allowed is limited, and you will have to write the code to perform the Intent.
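As a minimal sketch of what your code does with that JSON, the snippet below picks the highest-scoring intent from a response. The sample response is made up for illustration: the field names (`query`, `intents`, `intent`, `score`, `entities`) are an assumption about the shape of the preview response, and the scores, the `TurnOff` intent, and the threshold are invented.

```python
import json

# Made-up sample response; field names and scores are assumptions for illustration.
SAMPLE_RESPONSE = """
{
  "query": "turn on the lights",
  "intents": [
    {"intent": "TurnOn", "score": 0.97},
    {"intent": "TurnOff", "score": 0.02},
    {"intent": "None", "score": 0.01}
  ],
  "entities": []
}
"""


def top_intent(response_text, threshold=0.5):
    """Return the name of the best-scoring intent, or None if nothing clears the threshold."""
    response = json.loads(response_text)
    intents = response.get("intents", [])
    if not intents:
        return None
    best = max(intents, key=lambda item: item["score"])
    return best["intent"] if best["score"] >= threshold else None


print(top_intent(SAMPLE_RESPONSE))  # prints "TurnOn"
```

Your application would then map the returned intent name to whatever code actually performs the action.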

Steps to Understand LUIS Text

Because LUIS is currently in preview, and therefore free, this is a great time to start learning the new technology. To get started, you will need to create an account at www.luis.ai, and once that is complete, create a New App. When creating an app, one of the supported languages must be selected. No key is required, as a free key will be generated later. An app requires Intents; LUIS evaluates the text to see how likely it is to indicate an Intent. The text that is evaluated is compared to an Utterance, which you also need to create. For example, if you have an Intent for “SearchHotels”, an Utterance would be Find me a Hotel. While this is a perfectly good Utterance, there is no reference to a location, which is something pretty important when looking for hotels.

Entities are the descriptive parts of the Utterance. If I said show me hotels in [$geography] and replaced what was in the brackets with a city, then I would have a better idea of what hotels to return. If I add the pre-built entity geography, then LUIS will be able to recognize a location, which of course can be added to my utterances if I put square brackets [] around the entity name and a dollar sign $ in front of the name. I can add words people may use to describe a location with Features. If I add the word “near”, I can add the synonym catty-corner so that LUIS will understand that that word means “near”. Once I have a complete list of Intents, Utterances, and Features, I can train the application so that it can be tested and used in a component.
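To illustrate the [$entity] notation only, here is a small sketch that finds the entity placeholders in an utterance and substitutes concrete values. It does not call LUIS; the function names and the regular expression are my own, and "Seattle" is just a sample value.

```python
import re

# Matches placeholders written as [$name], e.g. [$geography].
PLACEHOLDER = re.compile(r"\[\$(\w+)\]")


def entity_names(utterance):
    """List the entity names referenced in an utterance template."""
    return PLACEHOLDER.findall(utterance)


def fill_entities(utterance, values):
    """Replace each [$name] placeholder with the matching value from a dict."""
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], utterance)


template = "show me hotels in [$geography]"
print(entity_names(template))                             # ['geography']
print(fill_entities(template, {"geography": "Seattle"}))  # show me hotels in Seattle
```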

Applying Cognitive Services in Real World Applications

Once I had a customized App created for LUIS to understand text, I used it to create a bot to explore how I could use the rules I implemented in the website. I used the Microsoft Bot Framework to create an application which calls the LUIS component I created. To reference the app created in LUIS, the application contains a reference key which provides the ability to call LUIS from within my application. As I don’t write much about C# code here, I didn’t include the code, but I would be happy to share it if you would like. Just drop me a line and I will post it.
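This is not the C# bot code, but as a rough sketch of the idea of calling LUIS with a key, the snippet below assembles a request URL from an endpoint, an app id, a key, and the text to evaluate. Everything here is an assumption for illustration: the endpoint is a placeholder, and the query parameter names (`id`, `subscription-key`, `q`) should be checked against the URL that LUIS shows when you publish your app.

```python
from urllib.parse import urlencode


def build_luis_request_url(endpoint, app_id, subscription_key, text):
    """Assemble a GET URL for a LUIS-style endpoint (parameter names are assumed)."""
    params = {
        "id": app_id,                          # which LUIS app to query
        "subscription-key": subscription_key,  # the reference key for your app
        "q": text,                             # the text to evaluate
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder values; substitute the endpoint, id, and key from your published app.
url = build_luis_request_url(
    "https://example.invalid/luis",
    "my-app-id",
    "my-key",
    "find me a hotel in Seattle",
)
print(url)
```

Keeping the key out of the source code, for example in a configuration setting, is a sensible choice for anything beyond a demo.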

 

Yours Always,

Ginger Grant

Data aficionado et SQL Raconteur