In our application, users insert or upload data and then view reports built from that data. In this flow, parsing, validating, and inserting the data are the crucial steps.
One option when receiving data is to store it in a JSON column, and that is what we did initially. But every piece of data manipulation required hand-written code, and at some point we could no longer maintain it.
# JSON
row = Data::Row.new(data: {first_name: "Jane", last_name: "Doe", date_of_birth: "2024-06-21"})
if row.save
  ...
else
  ...
end
Here, if first_name is mandatory or has to be unique, or date_of_birth must be a valid date, we have to parse and validate the JSON ourselves before inserting it. ActiveRecord already solves all of this, so we used it to save time and write cleaner code.
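To give an idea of what "validate it ourselves" looked like before the switch, here is a rough sketch (a hypothetical helper, not our actual code):
# Hypothetical sketch of the hand-rolled checks the JSON column approach forced on us.
def validate_row(data)
  errors = []
  errors << "first_name is required" if data["first_name"].blank?
  errors << "first_name must be unique" if Data::Row.where("data->>'first_name' = ?", data["first_name"]).exists?
  begin
    Date.iso8601(data["date_of_birth"].to_s)
  rescue ArgumentError
    errors << "date_of_birth must be a valid date"
  end
  errors
end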
Here are the models we use:
class Data::Table < ApplicationRecord
  # name      String
  # status    Integer
  # data_type Integer

  has_many :data_columns, class_name: "Data::Column", foreign_key: :data_table_id

  def create_postgres_table
    tb_name = table_name
    cols = data_columns
    # Schema.define gives us create_table/add_column outside of a migration.
    ActiveRecord::Schema.define do
      create_table tb_name, id: :uuid, if_not_exists: true do |t|
        cols.each do |col|
          t.column col.pg_column_name.to_sym, col.pg_data_type
        end
      end
    end
  end

  def delete_postgres_table
    tb_name = table_name
    ActiveRecord::Schema.define do
      drop_table tb_name, if_exists: true
    end
  end

  def table_name
    # One Postgres table per tenant per user-defined table.
    "data_tables_#{Current.tenant_name}_#{name.pluralize}".to_sym
  end
end
class Data::Column < ApplicationRecord
  # data_table_id
  # name           String
  # pg_column_name String (column name in the database)
  # data_type      String, "string,date,number,float,array"
  # required       Boolean
  # list_values    Array(Text)

  belongs_to :data_table, class_name: "Data::Table"

  after_create_commit do
    create_postgres_column
  end

  def pg_data_type
    {
      'Number'   => 'integer',
      'String'   => 'string',
      'Date'     => 'date',
      'Float'    => 'float',
      'Array'    => 'array',
      'DateTime' => 'datetime'
    }[data_type]
  end

  private

  def create_postgres_column
    table_name = data_table.table_name
    column = self
    ActiveRecord::Schema.define do
      add_column table_name, column.pg_column_name.to_sym, column.pg_data_type.to_sym
    end
  end
end
When a data_table is created, we bulk import its data_columns and then call data_table.create_postgres_table. Later, when a user adds a single column on its own, the after_create_commit callback on Data::Column creates the corresponding Postgres column.
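A rough illustration of that flow, with hypothetical attribute values and assuming insert_all is used for the bulk import (insert_all skips the after_create_commit callback, so no column is added before the Postgres table exists):
# Hypothetical example of the table-creation flow described above.
table = Current.organization.data_tables.create!(name: "customer")

Data::Column.insert_all([
  { data_table_id: table.id, name: "First name", pg_column_name: "first_name", data_type: "String", required: true },
  { data_table_id: table.id, name: "Birth date", pg_column_name: "date_of_birth", data_type: "Date", required: false }
])

table.create_postgres_table # => creates data_tables_<tenant>_customers

# Later, a column added through the normal path triggers the
# after_create_commit callback, which adds the Postgres column.
table.data_columns.create!(name: "Email", pg_column_name: "email", data_type: "String", required: true)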
Then, in the controller, we use the Entity class to parse, validate, and insert the data:
class DataRowsController < BaseController
  before_action :set_dependencies
  before_action :set_row, only: [:show, :destroy, :update]

  def create
    @row = @entity.new(item_params)
    if @row.valid?
      @row.save
    else
      ...
    end
  end

  ...
  ...
  ...

  private

  def item_params
    params.require(:data_rows).slice(*@data_columns.pluck(:pg_column_name)).permit!.to_h
  end

  def set_row
    @row = @entity.find_by(id: params[:id])
  end

  def set_dependencies
    @data_table = Current.organization.data_tables.find_by(id: params[:data_table_id])
    @data_columns = @data_table.data_columns
    @entity = Entity.create_klass(@data_table.table_name, @data_columns)
  end
end
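The Entity class that builds these runtime models isn't shown above. A minimal sketch of what create_klass could look like, assuming it wraps Class.new(ApplicationRecord) and derives validations from the column records (the names and options here are assumptions, not our exact implementation):
# A minimal sketch of a runtime model builder; the real Entity class is not shown in this post.
class Entity
  def self.create_klass(tb_name, columns)
    Class.new(ApplicationRecord) do
      self.table_name = tb_name.to_s

      # Derive validations from the user-defined column metadata.
      columns.each do |col|
        col_name = col.pg_column_name.to_sym
        validates col_name, presence: true if col.required
        validates col_name, inclusion: { in: col.list_values } if col.list_values.present?
      end

      # Anonymous classes have no name; provide one so logs, errors,
      # and ActiveModel naming keep working.
      define_singleton_method(:name) { tb_name.to_s.camelize }
    end
  end
end
With something like that in place, @entity.new(item_params) behaves like any other model: type casting, validations, and callbacks all come from ActiveRecord.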
Now all the validation, callbacks, and parsing are handled by ActiveRecord. After moving to this approach we even support foreign keys, indexes (for filtering), unique indexes, formulas (via callbacks), and so on. Whatever we can do in Rails, we can now do at runtime.
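For instance, an index for filtering on a user-defined column can be added with the same ActiveRecord::Schema trick the models above use (a sketch, where data_table and data_column stand for the records involved; pass unique: true for a unique index):
# Sketch: adding a filter index on a dynamic column at runtime.
tb_name = data_table.table_name
col_name = data_column.pg_column_name

ActiveRecord::Schema.define do
  add_index tb_name, col_name, if_not_exists: true
end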
Previously, all rows were stored in a single Data::Row table. Over time it grew to millions of rows, and even tenants with small volumes of data were affected. Now that each user-defined table lives in its own Postgres table, most operations are much faster.
When new tables keep getting created, schema dumping becomes slow in development. Rails also has a way to skip tables when generating the schema file. Put this in application.rb:
ActiveRecord::SchemaDumper.ignore_tables << /.+custom_data.*$/
Thank you for reading!