{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# {docnum}`step ref` Structured Data Files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section we focus on reading from and writing to files with a row-column format, such as comma-separated (csv) and tab-separated (tsv) data files.\n", "\n", "Although `numpy.loadtxt()` is suitable for this task, it is valuable to be able to write your own solution." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing a Data File" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us generate some data and write it in csv (comma-separated values) format. In general, the separator (delimiter) you use for your data is up to you, but if we use a .csv file extension it's best to stick to the standard and use commas." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Generating data\n", "x = np.linspace(0, 2*np.pi)\n", "y = np.sin(x)\n", "z = np.cos(x)\n", "\n", "# Writing the data to file in csv format\n", "with open('data1.csv', 'w') as f:\n", "    f.write('x,sin(x),cos(x)\\n')  # Header\n", "\n", "    for xx, yy, zz in zip(x, y, z):\n", "        f.write(f'{xx},{yy},{zz}\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are not familiar with the string formatting used (`f'{xx},{yy},{zz}\\n'`), see {doc}`../00-basics/08-string-formatting`. Note that in this line (and also in the header) we have separated the values with commas." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the file extension **.csv** acts more as a hint for other software. There is no physical difference between a file written with this extension and one written with any other extension (or with no extension at all). As long as the file mode is set to text (`'t'`, which is the default), we are writing plain text files." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The contents of our data file **data1.csv** look like:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "tags": [ "remove_input" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x,sin(x),cos(x)\n", "0.0,0.0,1.0\n", "0.1282282715750936,0.127877161684506,0.9917900138232462\n", "0.2564565431501872,0.25365458390950735,0.9672948630390295\n", "0.38468481472528077,0.3752670048793741,0.9269167573460217\n", "0.5129130863003744,0.49071755200393785,0.8713187041233894\n", "0.6411413578754679,0.5981105304912159,0.8014136218679567\n", "0.7693696294505615,0.6956825506034864,0.7183493500977276\n", "0.8975979010256552,0.7818314824680298,0.6234898018587336\n", "1.0258261726007487,0.8551427630053461,0.5183925683105252\n", "\n" ] } ], "source": [ "with open('data1.csv', 'r') as f:\n", "    data1 = f.readlines()\n", "\n", "print(('{}'*10).format(*data1[:10]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or in a more presentable format:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "tags": [ "remove_input" ] }, "outputs": [ { "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th>x</th><th>sin(x)</th><th>cos(x)</th></tr>\n", "</thead>\n", "<tbody>\n", "
<tr><td>0.000000</td><td>0.000000e+00</td><td>1.000000</td></tr>\n", "
<tr><td>0.128228</td><td>1.278772e-01</td><td>0.991790</td></tr>\n", "
<tr><td>0.256457</td><td>2.536546e-01</td><td>0.967295</td></tr>\n", "
<tr><td>0.384685</td><td>3.752670e-01</td><td>0.926917</td></tr>\n", "
<tr><td>...</td><td>...</td><td>...</td></tr>\n", "
<tr><td>5.898500</td><td>-3.752670e-01</td><td>0.926917</td></tr>\n", "
<tr><td>6.026729</td><td>-2.536546e-01</td><td>0.967295</td></tr>\n", "
<tr><td>6.154957</td><td>-1.278772e-01</td><td>0.991790</td></tr>\n", "
<tr><td>6.283185</td><td>-2.449294e-16</td><td>1.000000</td></tr>\n", "</tbody>\n", "</table>\n", "