Batch Scoring
Create date parameter
# Create a text widget for the report date and read its value
dbutils.widgets.text("varReportDate", "19000101")
ReportDate = dbutils.widgets.get("varReportDate")
print(ReportDate)
Connect to storage
storage_account_name = "mystorage"
storage_account_access_key = ""  # supply your storage account access key
file_location = "wasbs://<container>@mystorage.blob.core.windows.net/myfiles/data_" + ReportDate + ".csv"
file_type = "csv"

# Register the account key so Spark can read from Blob storage
spark.conf.set(
  "fs.azure.account.key." + storage_account_name + ".blob.core.windows.net",
  storage_account_access_key)
Define input schema
from pyspark.sql.types import *

# Explicit schema for the incoming CSV columns
schema = StructType([
  StructField("ReportingDate", DateType(), True),
  StructField("id", StringType(), True),
  StructField("x1", IntegerType(), True),
  StructField("x2", DoubleType(), True)
])
Read in new data
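A minimal sketch of the read, using the schema and file location defined above; the header and dateFormat options are assumptions about how the CSV was written.

# Read the day's file into a DataFrame with the explicit schema
df = (spark.read
  .format(file_type)
  .option("header", "true")
  .option("dateFormat", "yyyyMMdd")  # assumed date layout in the file
  .schema(schema)
  .load(file_location))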
Load in transformation pipeline and model
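One way this step might look, assuming the fitted transformation pipeline was saved as a PipelineModel and the model is, for example, a logistic regression; the DBFS paths are hypothetical placeholders.

from pyspark.ml import PipelineModel
from pyspark.ml.classification import LogisticRegressionModel

# Hypothetical paths; point these at wherever the artifacts were saved
pipeline = PipelineModel.load("/mnt/models/transform_pipeline")
model = LogisticRegressionModel.load("/mnt/models/lr_model")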
Score data using the model
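A sketch of scoring with the objects loaded above: apply the saved transformations, then the model; the column names selected here are assumptions.

# Apply saved transformations, then generate predictions
features = pipeline.transform(df)
scored = model.transform(features)
scored.select("id", "ReportingDate", "prediction").show(5)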
Write data back out to storage
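A possible write step, mirroring the input path convention used above; the output location and selected columns are assumptions.

# Write the scored records back to Blob storage as CSV
output_location = "wasbs://<container>@mystorage.blob.core.windows.net/myfiles/scored_" + ReportDate + ".csv"
(scored
  .select("id", "ReportingDate", "prediction")
  .write
  .mode("overwrite")
  .option("header", "true")
  .csv(output_location))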