@@ -4,11 +4,11 @@ This respository was created in order to make available the HOTMapper, a tool th
## Data ##
The dataset "Matrícula" can be found at the link: [INEP](http://portal.inep.gov.br/web/guest/microdados) in the section "Censo Escolar".
The original open datasets can be found at the link: [INEP](http://portal.inep.gov.br/web/guest/microdados) in the sections "Censo Escolar" and "Censo da Educação Superior".
The dataset "Local Oferta" can be found at the same link, but in the section "Censo da Educação Superior". Additionally, for convenience, all data from "Local Oferta" is in the directory open_data.
Additionally, for convenience, all data from "Local Oferta" is in the directory open_data.
**NOTE**: It's important that you verify if there is a column identifying the year of the dataset.
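One way to perform the check described in the note is to inspect the header row of the file. The following is a minimal sketch, not part of HOTMapper itself: the helper name `has_column`, the default `|` separator, and the column name in the example are all assumptions to adapt to your dataset.

```bash
# Sketch: succeed if the header row of a delimited file contains a given column.
# $1 = file, $2 = column name, $3 = separator (assumed to default to "|").
has_column() {
    # Take the header row, drop carriage returns and double quotes,
    # put one column name per line, then look for an exact match.
    head -n 1 "$1" | tr -d '\r"' | tr "${3:-|}" '\n' | grep -qxF -- "$2"
}

# Example with hypothetical file and column names:
# has_column FILEPATH/some_table.CSV ANO_CENSO && echo "year column found"
```

The match is exact and case-sensitive, so a column named `ANO` will not be confused with `ANO_CENSO`.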
## Requirements ##
...
...
@@ -74,7 +74,7 @@ $ .\env\Scripts/activate
$ pip install -r requirements.txt
```
## Command Line Interface ##
The CLI (Command Line Interface) follows the standard of the manage.py package, which means that to invoke a command you should use the following pattern:
...
...
@@ -131,6 +131,12 @@ You should use this command everytime a mapping protocol is updated.
The remap command allows creating new columns, dropping existing columns, renaming columns, and changing column types. Be aware that the bigger the table, the greater the RAM usage.
* generate_pairing_report: generate reports to compare data from different years.
```bash
...
...
@@ -145,3 +151,109 @@ The reports will be created in the folder "pairing"
```bash
$ python manage.py generate_backup
```
## Demo scenarios ##
This section explains how to execute the demo. Demo scenario 1 uses the dataset "local oferta", which is included in the directory open_data. Demo scenario 2 uses the dataset "matricula", which can be downloaded from [INEP](http://portal.inep.gov.br/web/guest/microdados) in the section "Censo Escolar".
In both scenarios, we assume that you have started the virtual environment as explained in Section `Installation - 5`.
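Before running the demo commands, you may want to confirm that the virtual environment is really active. The sketch below relies on the `VIRTUAL_ENV` variable that the standard `activate` scripts set. The helper name `in_virtualenv` is an assumption for illustration and is not part of HOTMapper.

```bash
# Sketch: succeed only when a virtual environment is active in this shell.
# The activate script of virtualenv/venv exports VIRTUAL_ENV.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Warn (without aborting) if no environment is active.
in_virtualenv || echo "No virtual environment is active; activate it before running manage.py" >&2
```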
### Demo scenario 1: ###
This section contains the commands used in scenario 1, which is the creation of a new data source and the inclusion of the corresponding data.
1) First we need to create the database. To do so, execute the following command:
```bash
$ ./manage.py create localoferta_ens_superior
```
2) Now, as we already have the mapping protocol, we need to insert the open data into the database. To do so, execute the following commands:
**NOTE:** FILEPATH is the **_full path_** for the directory where the open data table is, for example (in a Linux environment): `/home/c3sl/HOTMapper/open_data/DM_LOCAL_OFERTA_2010`
### Demo scenario 2: ###
This section contains the commands used in scenario 2, which is the update of an existing data source.
1) First we need to create the database. To do so, execute the following command:
```bash
$ ./manage.py create matricula
```
2) Now, as we already have the mapping protocol, we need to insert the open data into the database. To do so, execute the following commands:
**NOTE:** FILEPATH is the **_full path_** for the directory where the open data table is, for example (in a Linux environment): `/home/c3sl/HOTMapper/open_data/DM_LOCAL_OFERTA_2010`
3) Change the mapping protocol for matricula. You can use `matricula_remap.csv`: rename the current `matricula.csv` to something else, then rename `matricula_remap.csv` to `matricula.csv`. In that case, the only column that changes is "fundamental_af" in the year 2013.
4) Run the remap command:
```bash
$ ./manage.py remap matricula
```
The above command will update the table `Fonte` and the schema of the table `matricula`.
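The file renaming described in step 3 can be sketched as a small shell helper. This is only an illustration: the function name `swap_matricula_protocol` and the backup name `matricula_original.csv` are hypothetical, and the directory holding the mapping protocols must be passed in, since its location depends on your checkout.

```bash
# Sketch: swap matricula_remap.csv in as the active matricula protocol.
# $1 = directory that holds the mapping protocol CSV files (supply your own path).
swap_matricula_protocol() {
    dir="$1"
    # Both protocol files must exist before anything is renamed.
    if [ ! -f "$dir/matricula.csv" ] || [ ! -f "$dir/matricula_remap.csv" ]; then
        return 1
    fi
    # Refuse to overwrite an earlier backup (hypothetical backup name).
    if [ -e "$dir/matricula_original.csv" ]; then
        return 1
    fi
    # Keep the current protocol under another name, then activate the remap one.
    mv "$dir/matricula.csv" "$dir/matricula_original.csv" &&
    mv "$dir/matricula_remap.csv" "$dir/matricula.csv"
}
```

After a successful swap, running `./manage.py remap matricula` as in step 4 applies the new protocol. To return to the original protocol, reverse the two renames.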