Suggestions | Excel & Python |
Preface Fibonacci Simulation Database Web Plotting results UDF This website's continuity |
Excel is part of MS Office. VBA is the programming language in MS Office. Python is a programming language that has been developed since the early 90's. Unlike VBA Python is an Open Source programming language. The number of Python developers exceeds those of VBA by far. Python has many sub-libraries that can be complementary to the options in MS-Office. Python can calculate with indefinite large numbers unlike Excel for instance. Many Python programmers would like a better integration of Python and Excel. In the Python platform several attempts have been made to link Python to Excel. An important one being XLwings. XLwings is part of the Anaconda distribution of Python: https://www.anaconda.com/download/ I researched XLwings to assess its value in linking Python and Excel. XLwings creates an own Python sub library (xlwings.py) in which it rewrites all Excel options in Python. I don't consider that a fruitful path to follow. I tried to develop an alternative in which: - Python can be started from/by Excel - arguments from Excel can be passed to the Python code - Python's result will be stored in a form that is manageable for Excel: a csv-file or graphic - the Excel file is linked to the Python output by queries - the user can view/adapt the Python code in Excel. In order to compare the XLwings approach to mine I rewrote XLwings' examples. You can find them here: https://www.xlwings.org/examples Advantages in my approach: - no need for a separate Python library - the way the Python output will be used takes place in Excel, not 'hardcoded' in Python code - Python will only be used for what Excel can't do: Python is supplementary to Excel - the use of Python output in Excel is flexible - the use of Excel-events eliminates buttons in a worksheet and makes changes instantaneous - the Python tab in the Ribbon contains all (i.e. very few) buttons for recalculation/updating/refreshing - no special Add-In required - the Python-code can be viewed/adapted in a userform in Excel: no special Python editor is required - all options in Excel (actual and in the future) are available; no need to wait for implementation of new options in a Python library Excel as Python GUIThe VBA options in Excel provide you the means to create a VBA Userform as Grafical User Interface(GUI) for Python.Python has no builtin GUI. The Python-library with which you can create a GUI isn't very userfriendly to put it mildly. So why not using the builtin graphical VBA Userform options ? I integrated the examples 'Fibonacci', 'simulation', 'plot' and 'database' into the workbook 'Python GUI.xlsb'. The web example uses the pivottable and slicers as GUI; a userform isn't applicable in that case. The UDF example is meant for worksheets only. As soon as the workbook opens the userform will be shown. The Userform contains a Multipage with 5 tabs. All Pythoncode has been integrated in 1 Python-file: 'Excel.py'. The functions that will be called from Excel have separate names in the Python file. If Python requires extra libraries to be loaded it will do so as soon as the Function has been called. So the memory won't be burdened by libraries the function doesn't use. The Python code writes its results to a 'csv'-file or a graphical file. Sometimes the VBA code reads the csv-file directly. In some cases the csv-file will be read by a querytable. Querytables have an 'own' very hidden sheet ('database' and 'simulation'). The examples: - a Fibonacci array - a MonteCarlo-simulation - the use of a music database - web scraping and analysis of those data in a pivottable - 'plotting': graphical representation of calculations that Excel doesn't contain - the use of Python in UDFs Start Install Python in order to run the examples. Python has momentarily 2 'flavours': Python 2.x and 3.x. The x is a version number. The examples on this page run in both versions, the 'database' excepted. The database needs Python 3.x. If Python has been installed correctly you can use "py -2" to start Python 2.x and "py -3" to launch Python 3.x. The examples use this way to start Python. ExamplesA ZIP-file contains an example.Each Zip-file consists of - an Excel *.xlsb file - a Python *.py file - a *.csv file Every Excel-file contains 'Python' tab in the Ribbon Py edit button to show/amend the Python code button(s) to run macros NB. Keep in mind: a Userform only shows the Python code as plain text The Userform is not a Python-editor with accessories. StepsUnzip the zip-file in a separate, own directory.Open the Excel file. The button(s) in the Python-tab start the macros that trigger the Python-code. The result will be written into the worksheet. The first time the code runs it might be somewhat slower since libraries have to be loaded into memory. Excel as Python GUI Python GUI Fibonacci Fibonacci Enter in cell A2 how many Fibonacci numbers you want to generate. The VBA code in the worksheet_change event triggers the Python code. The Fibonacci numbers pop up in column B. Enter in cells C2 and D2 which numbers will constitute the double-sum: 2*(C2+D2) The VBA code in the worksheet_change event triggers the Python code. The result will be shown in cell B1. You can switch between calculations using the buttons 'Double' and 'Fibonacci' in the Python Tab in the Ribbon. The Excel file contains 1 querytable (in cell B1) for several Python calculations. The use of Python to calculate the double sum is only better than Excel for large numbers Excel can't handle. Have for example a look at the result for 600 Fibonacci-numbers. Python-file def main(n,n_csv):
c_00 = '' a, b = 0, 1 for it in range(n): a, b = b, a + b c_00 += str(a) +'\n' open(n_csv,'w').write(c_00) def d_sum(x, y, n_csv): open(n_csv,'w').write(str(2 * (x + y))+'\n \n ') Enter the parameters for the simulation in range B1:B7. Click button 'Refresh' in the Python tab in the Ribbon. The query in the hidden worksheet 'data' reads the results of the calculation in Python. The graph in sheet 'simulation' uses the query as DataSource. Refreshing the querytable means updating the graph. At the opening of the workbook the connection of the Querytable will be adapted to the csv-file's current location. The VBA code to trigger Python is in the codemodule of Sheet1. You can move the table in sheet1 to every location in the worksheet. The button 'Py edit' in the Python tab shows the Python code. You can inspect, amend and save the code. Python file import numpy as np def main(n_0,n_1,n_2,n_3,n_4,n_5,n_csv): n_3=np.log(1 + float(n_3)) n_4=float(n_4) n_6=float(float(n_1)/n_2) perc_selection = [5, 50, 95] # hardcoded price = np.zeros((n_2 + 1, n_0)) percentiles = np.zeros((n_2 + 1, 3)) price[0,:] = n_5 percentiles[0,:] = n_5 for t in range(1, n_2 + 1): rand_nums = np.random.randn(n_0) price[t,:] = price[t-1,:] * np.exp((n_3 - 0.5 * n_4**2) * n_6 + n_4 * rand_nums * np.sqrt(n_6)) percentiles[t, :] = np.percentile(price[t, :], perc_selection) sy=[t* n_6 for t in range(0,n_2+1)] open(n_csv,'w').write('') for (it,it1,it2) in zip(percentiles,price,sy): open(n_csv,'a').write(str(it[0])+"," +str(it[1])+"," + str(it[2]) +','+str(it1[0]) +','+str(it2) +'\n') Database Database The Zip file contains: - Excel file database.xlsb - Python file database.py - csv files combo.csv and list.csv - Python database chinook.sqlite The Python code uses the Python library 'sqlite'. Excel file The file contains 2 querytables Querytable 'combo' in the hidden worksheet 'source'. It contains the inventory of all musical genres in the database. The Querytable is connected to the csv file 'combo.csv'. File combo.csv contains the results of the function 'combo' in the Python file 'database.py' This function filters the musical genres in database 'chinook.sqlite'. The Combobox 'C_00' in sheet 'playlist' is filled by querytable 'combo' in sheet 'source'. Querytable 'list' in sheet 'playlist'. It contains all the tracks in the database that match the selected musical genre in the combobox in sheet 'playlist'. The querytable is connected to the csv file 'list.csv' File list.csv contains the results of the function 'list' in the Python file 'database.py' This function filters the tracks in database 'chinook.sqlite' that match the selected musical genre in Combobox 'C_00' in sheet 'playlist'. Opening the workbook will adapt the connection of the Querytables to the csv-file's current location. Use button 'combo' in the Python tab in the Ribbon to update the musical genres in Combobox 'C_00'. The button 'Playlist' in the Python tab selects the matching tracks. It is more sensible to trigger the Python code after the selection in Combobox 'C_00'. You will find that code in the 'C_00_change' event. The button 'Py edit' in the Python tab shows the Python code. You can inspect, amend and save the code. The Python file The VBA code to trigger Python is in the codemodule of Sheet1. The Python code illustrates 2 methods to pass a SQL-string in Python: 1. the SQL-string is 'hard coded' in the Python code: the function 'list' 2. the SQL-string is passed as argument from Excel to the Python code: the function 'combo' import sqlite3 def list(n_db,n_sql,n_csv): cursor = sqlite3.connect(n_db, detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES).cursor() cursor.execute( """ SELECT t.Name AS Track, alb.Title AS Album, art.Name AS Artist, t.Composer FROM PlaylistTrack pt INNER JOIN Track t ON pt.TrackId = t.TrackId INNER JOIN Album alb ON t.AlbumId = alb.AlbumId INNER JOIN Artist art ON alb.ArtistId = art.ArtistId WHERE PlaylistId = ? """, (n_sql,)) open(n_csv,'w').write(','.join(it[0] for it in cursor.description)+'\n') open(n_csv,'a').write('\n'.join(','.join(str(it1).replace(',',' ') for it1 in it) for it in cursor)) def combo(n_db,n_sql,n_csv): cursor = sqlite3.connect(n_db, detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES).cursor() open(n_csv,'w').write('\n'.join(it[1] for it in cursor.execute(n_sql))) Web data Web This file isn't a XLwings example. The URL contains approx. 417000 records at the time this code was written. If the URL doesn't exist anymore you can replace it in the VBA macro 'M_refresh'. The Python code uses the Python library 'requests'. The sheet 'data' contains a querytable 'data'. The query is connected to the file 'data.csv'. The Python code writes the results of the 'webscraping' into 'data.csv'. The sheet 'data' can be made invisible. The event code in ThisWorkbook_Open adapts the Querytable's connection to the location of the csv file 'data.csv'. The sheet 'pivottable' contains a pivottable. The 'Data Source' of this pivottable is the querytable in sheet 'data'. Every time the webdata are being updated the pivottable will be refreshed too. The button 'Refresh' in de Python tab refreshes the webdata. The VBA macro 'M_refresh' to triggger Python is in the codemodule of Sheet1. The button 'Py edit' in the Python tab shows the Python code. You can inspect, amend and save the code. Python-bestand import requests def load_data(n_url,n_csv): open(n_csv, 'w', newline="").write(requests.get(n_url).text) Plotting results Plot The Python code uses 2 Python libraries: - 'Numpy' for mathematical calculation - 'Matplotlib' for graphical representation of mathematical results. Enter in sheet 'streamplot' in cell B1 the parameter value (integer) for calculation in Numpy. Start the Python code clicking the button 'Draw' in the Python tab in the Ribbon. The Python code creates the graphical file 'graph.png' as result. The VBA macro 'M_plot' adds the graphical file into the worksheet. The VBA macro 'M_plot' is in the macromodule of sheet 'streamplot'. The button 'Py edit' in the Python tab shows the Python code. You can inspect, amend and save the code. Python file import numpy as np import matplotlib.pyplot as plt def main(n,n_png): # Based on: http://matplotlib.org/users/screenshots.html#streamplot Y, X = np.mgrid[-3:3:100j, -3:3:100j] U = -1 + n * X**2 + Y V = 1 - n * X - Y**2 fig, ax = plt.subplots(figsize=(6, 4)) strm = ax.streamplot(X, Y, U, V, color=U, linewidth=2, cmap=plt.cm.autumn) fig.colorbar(strm.lines) fig.savefig(n_png) UDF UDF One of XLwings' examples is a worksheet with UDFs. A combination of UDF and Python is not very fortunate. UDFs will be recalculated more often then strictly necessary. Even if a change takes place in another open workbook. I think it would be better to use the method implemented in the Fibonacci example. In that case recalculation only takes place if one of the constituents of the calculation has been altered. All other calculations will be left alone. That saves a lot of calculation time and makes the code faster and more robust. I strongly advise not to combine UDFs and Python. Nevertheless for illustration purposes I rewrote the XLwings example to show how the same results can be accomplished without XLwings. The ZIP file contains: - Excel file UDF.xlsb - 4 csv files - Python file udf.py The file 'UDF.xlsb' contains 4 querytables: 'data_001 ...data_004'. These queries are connected to 4 CSV files 'data_001.csv ....data_004.csv' Three querytables are part of sheet 'UDF'; querytable 'data_001' is in the hidden sheet 'data'. The 'workbook_Open' event code adapts the connections between the csv files and the querytables. UDFs can only be part of a macromodule; in this case module 'python_udf'. UDFs will be triggered by every change in the worksheet. That why the Python tab in the Ribbon doesn't contain any button to start macro's. The button 'Py edit' in the Python tab shows the Python code. You can inspect, amend and save the code. The XLwings example The first UDF in the XLwings example retrieves the workbook.name. That seems rather farfetched: the Excel formula =cell("filename") get it as well. I would rather restrict the use of Python to cases Excel doesn't supply. If you prefer a UDF I'd suggest this (much faster) one: Function F_snb()
F_snb=thisworkbook.fullname
End FunctionThe function 'correl' uses the Python library Pandas (part of the Anaconda distribution). I wasn't able to get 'correl' working. The other functions only use standard Python code. Please notice that the result will be written alongside the UDF-formula in the worksheet. Python file import pandas as pd def double_sum(x, y, n_csv): open(n_csv,'w').write(str(2 * (x + y)) + '\n \n') def add_one(sn, n_csv): open(n_csv,'w').write('\n'.join(','.join(str(it1+1) for it1 in it) for it in sn)) def matrix_mult(sn, sp,n_csv): open(n_csv,'w').write('\n'.join(','.join(str(sn[x][y]*sp[x][y]) for y in range(len(sn))) for x in range(len(sn)))) def correl(sn,n_csv): st=pd.dataframe(sn).corr() open(n_csv,'w').write('\n'.join(','.join(str(it1+1) for it1 in it) for it in st)) |