Markdown Notebook

Markdown is a lightweight markup language that you can use to add formatting elements to plain-text documents. Created by John Gruber in 2004, Markdown is now one of the world’s most popular markup languages. Using Markdown is different from using a WYSIWYG editor: in an application like Microsoft Word, you click buttons to format words, whereas in Markdown you type the formatting as plain text. A Markdown cell in Jupyter Notebook can display six levels of heading. To make a heading, start the line with # followed by a space and then the text. A single # gives a level 1 heading, the biggest.
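As a small illustration of the heading syntax described above, the following Python sketch builds one Markdown heading line per level. The helper name markdown_heading is hypothetical and is not part of Jupyter or any library.

```python
def markdown_heading(text: str, level: int = 1) -> str:
    """Return a Markdown heading line: `level` '#' characters, a space, then the text."""
    if not 1 <= level <= 6:
        raise ValueError("Markdown headings support levels 1 to 6")
    return "#" * level + " " + text


# Build one heading per level, biggest (level 1) first.
headings = [markdown_heading(f"Heading level {n}", n) for n in range(1, 7)]
print("\n".join(headings))
```

Pasting the printed lines into a Markdown cell should render six headings of decreasing size.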

Just a short post following a recent question I got from my delivery team… Are there any best practices for structuring our Databricks Notebooks in terms of code comments and markdown? Having done a little Googling I simply decided to whip up a quick example that could be adopted as a technical standard for the team going forward.

For me, one of the hardest parts of developing anything is when you need to pick up and rework code that has been created by someone else. That said, my preferred Notebook structure shown below is not about technical performance or anything complicated. This is simply for ease of sharing and understanding, as well as some initial documentation for work done.

In my example I created a Scala Notebook, but this could of course apply to any flavour.

The key things I would like to see in a Notebook are:

Markdown Headings – including the Notebook title, who created it, why, input and output details. We might also have references to external resources and maybe a high level version history. I created this in a table via the markdown and injected a bit of HTML too for the bullet points.
Common Code – where boilerplate code is used I like to have this in a set of common Notebooks that are run to establish a framework for any following content.
Widgets – if required I expect all widgets to be created and referenced near the top of the Notebook. Maybe with some defensive checks on values passed.
Cell Titles – all cells within the Notebook should include a title to support their purpose in the overall script.
Logging – in most cases we should have a framework for outputting log information to a central location, via Application Insights or even just a SQL DB table.
Comments – probably the most important thing to include in all code is the comments. These should not be text for the sake of it, or text that simply translates code into English. They should be small amounts of narrative explaining why. What was the thinking behind a certain line or condition? If hard-coded values have to be used, what do they mean in the wider business logic? When writing comments in code, I ask myself what the next person who reads this would want to know.
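To make a few of these points concrete, here is a minimal Python sketch of defensive widget checks, basic logging and a "why" comment. It is only an illustration under stated assumptions: the widget names, the allowed values, the logger name and the business rule are all hypothetical, and standard-library logging to the console stands in for a central framework such as Application Insights.

```python
import logging
from datetime import date

# Hypothetical set of values this Notebook accepts for an "environment" widget.
ALLOWED_ENVIRONMENTS = {"dev", "test", "prod"}


def check_environment(raw_value: str) -> str:
    """Defensive check: normalise the widget value and reject anything unexpected."""
    value = raw_value.strip().lower()
    if value not in ALLOWED_ENVIRONMENTS:
        raise ValueError(
            f"Unexpected environment {raw_value!r}; expected one of {sorted(ALLOWED_ENVIRONMENTS)}"
        )
    return value


def check_run_date(raw_value: str) -> date:
    """Defensive check: widget values arrive as strings, so parse the date up front."""
    return date.fromisoformat(raw_value.strip())


# In Databricks the raw strings would come from widgets, for example
# dbutils.widgets.get("environment"). Literals are used here so the sketch runs anywhere.
environment = check_environment(" Dev ")
run_date = check_run_date("2020-01-31")

# Stand-in for a central logging framework: standard-library logging to the console.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(name)s %(message)s")
logger = logging.getLogger("notebook_example")
logger.info("Starting run for environment=%s run_date=%s", environment, run_date)

# Why 7: a hypothetical business rule that late-arriving records are accepted for one week.
# The comment records the reason for the value, not a translation of the code.
LATE_ARRIVAL_WINDOW_DAYS = 7
```

Validating the values near the top of the Notebook means a bad parameter fails fast, before any data is read or written.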

Graphically these are shown in my simple example Notebook below. Feel free to also download this Scala file from my GitHub repository: Notebook Example.scala

If you think this was useful, or if you know of other best practices for structuring a Notebook I’d be interested to know so please leave a comment.

Many thanks for reading.

3.5 R Markdown Notebooks

As mentioned in Section 2.2 of the R Markdown Definitive Guide (Xie, Allaire, and Grolemund 2018), there are several ways to compile an Rmd document. One of them is to use R Markdown Notebooks, with the output format html_notebook, e.g., by setting output: html_notebook in the YAML metadata of the document.

When you use this output format in RStudio, the Knit button on the toolbar will become the Preview button.

The main advantage of using notebooks is that you can work on an Rmd document iteratively in the same R session. You can run one code chunk at a time by clicking the green arrow button on each chunk, and you will see the text or plot output in the editor. When you click the Preview button on the toolbar, it only renders the Rmd document to an HTML output document containing the output of all code chunks that you have already executed. The Preview button does not execute any code chunks. By comparison, when you use other output formats and hit the Knit button, RStudio launches a new R session to compile the whole document (hence all code chunks are executed at once), which usually takes more time.

If you do not like RStudio’s default behavior of showing the output of code chunks inline when you run them individually, you can uncheck the option “Show output inline for all R Markdown documents” from the menu Tools -> Global Options -> R Markdown. After that, when you run a code chunk, the output will be shown in the R console instead of inside the source editor. You can also set this option for an individual Rmd document in its YAML metadata, by setting chunk_output_type: console under the editor_options field.