class FhirDeIdentification extends Transformer with HasFeatures with LightDeIdentificationParams with DeidModelParams with CheckLicense with HasInputCol with HasOutputAnnotationCol with ParamsAndFeaturesWritable

A Spark Transformer for de-identifying FHIR resources according to configurable privacy rules.

Overview

Performs field-level obfuscation on FHIR JSON documents using FHIR Path expressions. Supports R4, R5, and DSTU3 FHIR versions with type-aware de-identification strategies. Additionally, supports different parser types (JSON, XML) for FHIR resources.

Example:
  1. Basic Pipeline Usage

    val deid = new FhirDeIdentification()
      .setInputCol("raw_fhir")
      .setOutputCol("deidentified")
      .setMode("obfuscate")
      .setMappingRules(Map("Patient.birthDate" -> "Date"))
    
    val pipeline = new Pipeline().setStages(Array(deid))
See also

FHIR Specification

Linear Supertypes
ParamsAndFeaturesWritable, DefaultParamsWritable, MLWritable, HasOutputAnnotationCol, HasInputCol, CheckLicense, DeidModelParams, BaseDeidParams, LightDeIdentificationParams, HasFeatures, Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new FhirDeIdentification()
  2. new FhirDeIdentification(uid: String)

    uid

    A unique identifier for the instantiated annotator.

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. def $$[T](feature: StructFeature[T]): T
    Attributes
    protected
    Definition Classes
    HasFeatures
  5. def $$[K, V](feature: MapFeature[K, V]): Map[K, V]
    Attributes
    protected
    Definition Classes
    HasFeatures
  6. def $$[T](feature: SetFeature[T]): Set[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  7. def $$[T](feature: ArrayFeature[T]): Array[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  8. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. val ageRanges: IntArrayParam

    List of integers specifying limits of the age groups to preserve during obfuscation

    Definition Classes
    BaseDeidParams
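
    As an illustration only, one plausible reading of "limits of the age groups" is a sorted list of boundaries, where an age is reported as the group it falls into. The sketch below is not the library's implementation; `AgeRangeSketch` and `groupFor` are hypothetical names, and the exclusive upper bound and the open-ended last group are assumptions.

```scala
object AgeRangeSketch {
  // Hypothetical helper: returns the (lower, upper) limits of the group containing `age`.
  // Assumption: `limits` are group boundaries, the upper limit is exclusive, and ages
  // at or beyond the last limit fall into an open-ended group (upper = Int.MaxValue).
  def groupFor(age: Int, limits: Array[Int]): (Int, Int) = {
    val sorted = limits.sorted
    val lower = sorted.filter(_ <= age).lastOption.getOrElse(0)
    val upper = sorted.find(_ > age).getOrElse(Int.MaxValue)
    (lower, upper)
  }
}
```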
  10. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  11. def checkValidEnvironment(spark: Option[SparkSession], scopes: Seq[String]): Unit
    Definition Classes
    CheckLicense
  12. def checkValidScope(scope: String): Unit
    Definition Classes
    CheckLicense
  13. def checkValidScopeAndEnvironment(scope: String, spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  14. def checkValidScopesAndEnvironment(scopes: Seq[String], spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  15. final def clear(param: Param[_]): FhirDeIdentification.this.type
    Definition Classes
    Params
  16. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  17. val consistentAcrossNameParts: BooleanParam

    Param that indicates whether consistency should be enforced across different parts of a name (e.g., first name, middle name, last name). When set to true, the same transformation or obfuscation will be applied consistently to all parts of the same name entity, even if those parts appear separately.

    For example, if "John Smith" is obfuscated as "Liam Brown", then:

    • When the full name "John Smith" appears, it will be replaced with "Liam Brown"
    • When "John" or "Smith" appear individually, they will still be obfuscated as "Liam" and "Brown" respectively, ensuring consistency in name transformation.

    Default: true
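
    The "John Smith" example above can be pictured as a shared lookup of part-level replacements. This is a minimal sketch of the idea, not the library's code; `NamePartsSketch`, `partMemory`, and `obfuscateName` are hypothetical names, and splitting on whitespace is an assumption.

```scala
object NamePartsSketch {
  // Hypothetical memory of part-level replacements, mirroring "John Smith" -> "Liam Brown".
  val partMemory: Map[String, String] = Map("John" -> "Liam", "Smith" -> "Brown")

  // Replace each whitespace-separated part through the shared memory.
  // Parts without an entry are left untouched in this sketch.
  def obfuscateName(name: String, memory: Map[String, String]): String =
    name.split("\\s+").map(part => memory.getOrElse(part, part)).mkString(" ")
}
```

    Because the full name and the individual parts go through the same memory, "John" maps to "Liam" whether it appears alone or inside "John Smith".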

    Definition Classes
    BaseDeidParams
  18. def copy(extra: ParamMap): FhirDeIdentification
    Definition Classes
    FhirDeIdentification → Transformer → PipelineStage → Params
  19. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  20. val customFakers: MapFeature[String, Array[String]]

    The dictionary of custom fakers to specify the obfuscation terms for the entities. You can specify the entity and the terms to be used for obfuscation.

    Definition Classes
    LightDeIdentificationParams
  21. val dateEntities: StringArrayParam

    List of date entities. Default: Array("DATE", "DOB", "DOD")

    Definition Classes
    LightDeIdentificationParams
  22. val dateFormats: StringArrayParam

    Format of dates to displace

    Definition Classes
    BaseDeidParams
  23. val days: IntParam

    Number of days by which dates are displaced during obfuscation. If not provided, a random integer between 1 and 60 is used.

    Definition Classes
    BaseDeidParams
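
    Date displacement can be sketched with java.time. The code below is illustrative rather than the library's implementation: `DateShiftSketch`, `displacement`, and `shift` are hypothetical names, and shifting forward on ISO dates (yyyy-MM-dd) is an assumption.

```scala
import java.time.LocalDate
import scala.util.Random

object DateShiftSketch {
  // If no value is configured, fall back to a random displacement in [1, 60].
  def displacement(days: Option[Int], rng: Random): Int =
    days.getOrElse(rng.nextInt(60) + 1)

  // Shift an ISO date (yyyy-MM-dd) forward by the given number of days.
  // Assumption: the shift direction is forward; the source does not state it.
  def shift(isoDate: String, days: Int): String =
    LocalDate.parse(isoDate).plusDays(days.toLong).toString
}
```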
  24. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  25. def deidentify(jsonStr: String, rules: Map[String, String]): String
    Attributes
    protected
  26. def deidentify(jsonStr: String): String
  27. def deidentify_list(jsonStrs: ArrayList[String]): List[String]
  28. def deidentify_list(jsonStrs: Array[String]): Array[String]
  29. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  30. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  31. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  32. def explainParams(): String
    Definition Classes
    Params
  33. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  34. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  35. val fakerLengthOffset: IntParam

    Specifies how much length deviation is accepted during obfuscation when keepTextSizeForObfuscation is enabled. The value must be greater than 0. Default: 3.

    Definition Classes
    BaseDeidParams
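
    To make "length deviation" concrete, the sketch below keeps only replacement candidates whose length is within the offset of the original. It is an assumption about how such a filter could look, not the library's code; `LengthOffsetSketch` and `withinOffset` are hypothetical names.

```scala
object LengthOffsetSketch {
  // Keep only candidates whose length differs from the original
  // by at most `offset` characters.
  def withinOffset(original: String, candidates: Seq[String], offset: Int): Seq[String] =
    candidates.filter(c => math.abs(c.length - original.length) <= offset)
}
```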
  36. val features: ArrayBuffer[Feature[_, _, _]]
    Definition Classes
    HasFeatures
  37. val fhirVersion: Param[String]

    The FHIR version to de-identify. Supported versions are R4, R5, and DSTU3. Default: R4.

  38. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  39. val fixedMaskLength: IntParam

    Select the fixed mask length: this is the length of the masking sequence that will be used when the 'fixed_length_chars' masking policy is selected.

    Definition Classes
    LightDeIdentificationParams
  40. val genderAwareness: BooleanParam

    Whether to use gender-aware names during obfuscation. This param affects only names. Setting it to true might decrease performance. Default: false

    Definition Classes
    BaseDeidParams
  41. def generateFakeBySameLength(wordToReplace: String, entity: String): String

    Obfuscates digits to new digits and letters to new letters; all other characters remain the same.

    Definition Classes
    DeidModelParams
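
    The character-class-preserving idea described above can be sketched as follows. This is not the method's actual body; `SameLengthSketch` and `fakeSameLength` are hypothetical names, and preserving letter case and using a caller-supplied `Random` are assumptions.

```scala
import scala.util.Random

object SameLengthSketch {
  // Map digits to random digits and letters to random letters of the same case;
  // every other character (spaces, dashes, punctuation) is copied unchanged.
  def fakeSameLength(word: String, rng: Random): String =
    word.map {
      case c if c.isDigit => ('0' + rng.nextInt(10)).toChar
      case c if c.isUpper => ('A' + rng.nextInt(26)).toChar
      case c if c.isLower => ('a' + rng.nextInt(26)).toChar
      case c              => c
    }
}
```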
  42. def generateFakeBySameLengthUsingHash(wordToReplace: String, entity: String): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  43. def get[T](feature: StructFeature[T]): Option[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  44. def get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  45. def get[T](feature: SetFeature[T]): Option[Set[T]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  46. def get[T](feature: ArrayFeature[T]): Option[Array[T]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  47. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  48. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  49. def getConsistentAcrossNameParts: Boolean

    Gets the value of consistentAcrossNameParts.

    returns

    Boolean value indicating if consistency is enforced across name parts

    Definition Classes
    BaseDeidParams
  50. def getCustomFakers: Map[String, Array[String]]

    Gets customFakers param.

    Attributes
    protected
    Definition Classes
    LightDeIdentificationParams
  51. def getDateEntities: Array[String]

    Gets dateEntities param.

    Definition Classes
    LightDeIdentificationParams
  52. def getDateFormats: Array[String]
    Definition Classes
    BaseDeidParams
  53. def getDays: Int
    Definition Classes
    BaseDeidParams
  54. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  55. def getEntityField(annotation: Annotation): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  56. def getFakeByHashcode(fakes: Seq[String], wordToReplace: String, entity: String, seed: Int): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  57. def getFakeWithSameSize(fakes: Seq[String], wordToReplace: String, entity: String, lengthDeviation: Int, seed: Int): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  58. def getFakerLengthOffset: Int

    Gets fakerLengthOffset param

    Definition Classes
    BaseDeidParams
  59. def getFakersEntity(entity: String, result: String): Seq[String]
    Definition Classes
    DeidModelParams
  60. def getFhirVersion: String

    Gets the value of fhirVersion

  61. def getFixedMaskLength: Int

    Gets fixedMaskLength param.

    Definition Classes
    LightDeIdentificationParams
  62. final def getInputCol: String
    Definition Classes
    HasInputCol
  63. def getKeepMonth: Boolean

    Gets keepMonth param

    Definition Classes
    LightDeIdentificationParams
  64. def getKeepTextSizeForObfuscation: Boolean

    Gets keepTextSizeForObfuscation param

    Definition Classes
    BaseDeidParams
  65. def getKeepYear: Boolean

    Gets keepYear param

    Definition Classes
    LightDeIdentificationParams
  66. def getLanguage: String
    Definition Classes
    BaseDeidParams
  67. def getMappingRules: Map[String, String]
  68. def getMappingRulesAsStr: String
  69. def getMaskEntity(entityClazz: String): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  70. def getMaskStatus(entityClass: String): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  71. def getMaskingPolicy: String

    Gets maskingPolicy param.

    Definition Classes
    LightDeIdentificationParams
  72. def getMaxSentence(annotations: Seq[Annotation]): Int
    Attributes
    protected
    Definition Classes
    DeidModelParams
  73. def getMode: String

    Gets mode param.

    Definition Classes
    LightDeIdentificationParams
  74. def getObfuscateDate: Boolean

    Gets obfuscateDate param

    Definition Classes
    LightDeIdentificationParams
  75. def getObfuscateRefSource: String
    Definition Classes
    BaseDeidParams
  76. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  77. final def getOutputCol: String
    Definition Classes
    HasOutputAnnotationCol
  78. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  79. def getParserType: String

    Gets the value of parserType

  80. def getRegion: String

    Gets region param.

    Definition Classes
    LightDeIdentificationParams
  81. def getSameLengthFormattedEntities(): Array[String]
    Definition Classes
    BaseDeidParams
  82. def getSeed(): Int
    Definition Classes
    BaseDeidParams
  83. def getSelectiveObfuscationModes: Option[Map[String, Array[String]]]

    Gets selectiveObfuscationModes param.

  84. def getUnnormalizedDateMode: String

    Gets unnormalizedDateMode param.

  85. def getUseShiftDays: Boolean

    Gets useShiftDays param.

    Definition Classes
    LightDeIdentificationParams
  86. def getValidAgeRanges: Array[Int]

    Gets validAgeRanges parameter

    Definition Classes
    FhirDeIdentification → DeidModelParams
  87. def handleCasing(originalFake: String, wordToReplace: String): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  88. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  89. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  90. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  91. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  92. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  93. final val inputCol: Param[String]
    Definition Classes
    HasInputCol
  94. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  95. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  96. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  97. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  98. val keepMonth: BooleanParam

    Whether to keep the month intact when obfuscating date entities. If true, the month will remain unchanged during the obfuscation process. If false, the month will be modified along with the year and day. Default: false.

    Definition Classes
    LightDeIdentificationParams
  99. val keepTextSizeForObfuscation: BooleanParam

    Specifies whether the output should keep the same character length as the input text. The length is preserved when a replacement of the same length is available; otherwise the length might vary.

    Definition Classes
    BaseDeidParams
  100. val keepYear: BooleanParam

    Whether to keep the year intact when obfuscating date entities. If true, the year will remain unchanged during the obfuscation process. If false, the year will be modified along with the month and day. Default: false.

    Definition Classes
    LightDeIdentificationParams
  101. val language: Param[String]

    The language used to select the regex file and some faker entities: 'en' (English), 'de' (German), 'es' (Spanish), 'fr' (French), 'ar' (Arabic), or 'ro' (Romanian). Default: 'en'

    Definition Classes
    BaseDeidParams
  102. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  103. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  104. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  105. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  106. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  107. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  108. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  109. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  110. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  111. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  112. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  113. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  114. val mappingRules: MapFeature[String, String]

    FHIR field de-identification rules for primitive type obfuscation.

    Overview

    Defines how specific FHIR elements should be de-identified using FHIR Path syntax. Supports all FHIR primitive types with built-in obfuscation strategies.

  115. def maskEntity(annotation: Annotation, maskingPolicy: String, maskedEntity: String, fixedMaskLength: Int): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  116. val maskingPolicy: Param[String]

    Select the masking policy:

    • 'entity_labels': Replace the values with their entity labels.
    • 'same_length_chars': Replace the entity with asterisks of the same length minus two, plus a bracket on each end. If the entity is shorter than 3 characters (like Jo, or 5), asterisks are used without brackets.
    • 'fixed_length_chars': Replace the entity with a masking sequence composed of a fixed number of asterisks.
    • Default: 'entity_labels'
    Definition Classes
    LightDeIdentificationParams
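
    The three masking policies can be illustrated with a small stand-alone function. This is a sketch based only on the descriptions in this entry, not the library's implementation; `MaskingPolicySketch` and `mask` are hypothetical names, `fixedLength` stands in for fixedMaskLength, and the `<LABEL>` format is taken from the mask-mode example elsewhere on this page.

```scala
object MaskingPolicySketch {
  // Apply one of the documented masking policies to a single entity.
  def mask(text: String, label: String, policy: String, fixedLength: Int): String =
    policy match {
      case "entity_labels" => s"<$label>"
      case "same_length_chars" =>
        // Entities shorter than 3 characters get plain asterisks, without brackets.
        if (text.length < 3) "*" * text.length
        else "[" + "*" * (text.length - 2) + "]"
      case "fixed_length_chars" => "*" * fixedLength
      case other => throw new IllegalArgumentException(s"Unknown masking policy: $other")
    }
}
```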
  117. val mode: Param[String]

    Mode for Anonymizer ['mask' or 'obfuscate']. Default: 'mask'

    • Mask mode: The entities will be replaced by their entity types.
    • Obfuscate mode: The entity is replaced by an obfuscator's term.
    Definition Classes
    LightDeIdentificationParams
    Example:
    1. Given the following text: "David Hale visited EEUU a couple of years ago"

      • Mask mode: "<PERSON> visited <COUNTRY> a couple of years ago"
      • Obfuscate mode: "Bryan Johnson visited Japan a couple of years ago"
  118. val nameEntities: Seq[String]
    Attributes
    protected
    Definition Classes
    DeidModelParams
  119. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  120. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  121. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  122. val obfuscateDate: BooleanParam

    When mode == "obfuscate", whether to obfuscate dates or not. This param helps keep date handling consistent by making the use of dateFormats explicit. When setting it to true, make sure the dateFormats param fits your needs. If the value is true and obfuscation fails, unnormalizedDateMode is activated. When set to false, the date is masked as <DATE>. Default: false

    Definition Classes
    LightDeIdentificationParams
  123. def obfuscateNameEntity(originalName: String, keepTextSize: Boolean, lengthDeviation: Int, namePartsMemory: Map[String, String]): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  124. val obfuscateRefSource: Param[String]

    The source used to obfuscate the entities. The values are the following:

    • 'file': Takes the entities from the obfuscatorRefFile.
    • 'faker': Takes the entities from the Faker module.
    • 'both': Takes the entities from the obfuscatorRefFile and the Faker module randomly.

    Definition Classes
    BaseDeidParams
  125. def onWrite(path: String, spark: SparkSession): Unit
    Attributes
    protected
    Definition Classes
    ParamsAndFeaturesWritable
  126. final val outputCol: Param[String]
    Attributes
    protected
    Definition Classes
    HasOutputAnnotationCol
  127. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  128. val parserType: Param[String]

    Parser type used to parse the FHIR string. Supported types are JSON and XML.

  129. val random: SecureRandom
    Attributes
    protected
    Definition Classes
    DeidModelParams
  130. val region: Param[String]

    Selects the date formats of a particular region. This property is used mainly when obfuscating dates: it decides whether the first or the second part of a date such as 11/11/2023 is the day.

    The values are the following:

    • 'eu' for European Union
    • 'us' for USA
    Definition Classes
    LightDeIdentificationParams
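
    The day/month ambiguity is easier to see with a date such as 03/04/2023. The sketch below uses java.time to show the two readings; it is illustrative only, `RegionDateSketch` and `parse` are hypothetical names, and the two patterns are assumptions about what each region implies.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object RegionDateSketch {
  // Interpret a slash-separated date according to the region convention.
  // Assumption: 'eu' means day first, 'us' means month first.
  def parse(date: String, region: String): LocalDate = {
    val pattern = region match {
      case "eu"  => "dd/MM/yyyy"
      case "us"  => "MM/dd/yyyy"
      case other => throw new IllegalArgumentException(s"Unknown region: $other")
    }
    LocalDate.parse(date, DateTimeFormatter.ofPattern(pattern))
  }
}
```

    With 'eu', 03/04/2023 is read as 3 April 2023; with 'us', it is read as 4 March 2023.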
  131. val sameLengthFormattedEntities: StringArrayParam

    List of formatted entities for which obfuscation generates outputs of the same length as the originals. The supported and default formatted entities are: "phone", "fax", "contact", "id", "idnum", "bioid", "medicalrecord", "zip", "vin", "ssn", "dln", "plate", "license", "IRS", "CFN", "account".

    Definition Classes
    BaseDeidParams
  132. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  133. val seed: IntParam

    The seed used to select the entities in obfuscate mode. With a fixed seed, you can replay an execution several times with the same output.

    Definition Classes
    BaseDeidParams
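
    The reproducibility that a seed provides can be shown with a seeded generator. This sketch does not reflect how the library selects fakes internally; `SeedSketch` and `pick` are hypothetical names, and `candidates` is assumed to be non-empty.

```scala
import scala.util.Random

object SeedSketch {
  // Pick one candidate with a generator created from the seed.
  // The same seed and the same candidates always give the same pick.
  def pick(candidates: Seq[String], seed: Int): String =
    candidates(new Random(seed.toLong).nextInt(candidates.length))
}
```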
  134. def selectFakeFromAllFakes(wordToReplace: String, entityClass: String, maskedEntity: String, allFakes: Seq[String]): String
    Attributes
    protected
    Definition Classes
    DeidModelParams
  135. val selectiveObfuscationModes: StructFeature[Map[String, Array[String]]]

    The dictionary of modes to enable multi-mode deidentification.

    • 'obfuscate': Replace the values with random values.
    • 'mask_same_length_chars': Replace the entity with asterisks of the same length minus two, plus a bracket on each end.
    • 'entity_labels': Replace the values with their entity labels.
    • 'mask_fixed_length_chars': Replace the entity with asterisks of fixed length. You can also invoke "setFixedMaskLength()".
    • 'skip': Skip the entities (leave them intact).

    Entities that are not listed in the dictionary are de-identified according to setMode().

    Definition Classes
    LightDeIdentificationParams
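
    The fallback behaviour (entities missing from the dictionary follow the global mode) can be sketched as a lookup. This is an illustration, not the library's code; `SelectiveModeSketch` and `modeFor` are hypothetical names, and the assumption is that the dictionary is keyed by mode with the entity names as values, as the Map[String, Array[String]] type suggests.

```scala
object SelectiveModeSketch {
  // Find the mode whose entity list contains `entity`;
  // fall back to the global mode when no list mentions it.
  def modeFor(entity: String,
              selective: Map[String, Array[String]],
              defaultMode: String): String =
    selective
      .collectFirst { case (mode, entities) if entities.contains(entity) => mode }
      .getOrElse(defaultMode)
}
```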
  136. def set[T](feature: StructFeature[T], value: T): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  137. def set[K, V](feature: MapFeature[K, V], value: Map[K, V]): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  138. def set[T](feature: SetFeature[T], value: Set[T]): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  139. def set[T](feature: ArrayFeature[T], value: Array[T]): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  140. final def set(paramPair: ParamPair[_]): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    Params
  141. final def set(param: String, value: Any): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    Params
  142. final def set[T](param: Param[T], value: T): FhirDeIdentification.this.type
    Definition Classes
    Params
  143. def setAgeRanges(mode: Array[Int]): FhirDeIdentification.this.type

    List of integers specifying limits of the age groups to preserve during obfuscation

    Definition Classes
    BaseDeidParams
  144. def setConsistentAcrossNameParts(value: Boolean): FhirDeIdentification.this.type

    Sets the value of consistentAcrossNameParts.

    value

    Boolean flag to enforce consistency across name parts

    returns

    this instance

    Definition Classes
    BaseDeidParams
  145. def setCustomFakers(value: HashMap[String, List[String]]): FhirDeIdentification.this.type
    Definition Classes
    LightDeIdentificationParams
  146. def setCustomFakers(value: Map[String, Array[String]]): FhirDeIdentification.this.type

    Sets the value of customFakers. The dictionary of custom fakers to specify the obfuscation terms for the entities. You can specify the entity and the terms to be used for obfuscation.

    Example:

    new LightDeIdentification()
     .setInputCols(Array("ner_chunk", "sentence")).setOutputCol("dei")
     .setMode("obfuscate")
     .setObfuscateRefSource("custom")
     .setCustomFakers(Map(
         "NAME" -> Array("George", "Taylor"),
         "SCHOOL" -> Array("Oxford", "Harvard"),
         "city" -> Array("ROMA")
     ))
    Definition Classes
    LightDeIdentificationParams
  147. def setDateEntities(value: Array[String]): FhirDeIdentification.this.type

    Sets the value of dateEntities. Default: Array("DATE", "DOB", "DOD")

    Definition Classes
    LightDeIdentificationParams
  148. def setDateFormats(s: Array[String]): FhirDeIdentification.this.type

    Format of dates to displace

    Definition Classes
    BaseDeidParams
  149. def setDays(k: Int): FhirDeIdentification.this.type

    Number of days by which dates are displaced during obfuscation. If not provided, a random integer between 1 and 60 is used.

    Definition Classes
    BaseDeidParams
  150. def setDefault[T](feature: StructFeature[T], value: () ⇒ T): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  151. def setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  152. def setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  153. def setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  154. final def setDefault(paramPairs: ParamPair[_]*): FhirDeIdentification.this.type
    Attributes
    protected
    Definition Classes
    Params
  155. final def setDefault[T](param: Param[T], value: T): FhirDeIdentification.this.type
    Attributes
    protected[org.apache.spark.ml]
    Definition Classes
    Params
  156. def setFakerLengthOffset(value: Int): FhirDeIdentification.this.type

    Sets fakerLengthOffset param

    Definition Classes
    BaseDeidParams
  157. def setFhirVersion(value: String): FhirDeIdentification.this.type

    Sets the value of fhirVersion: the FHIR version to de-identify. Supported versions are R4, R5, and DSTU3. Default: R4.

  158. def setFixedMaskLength(value: Int): FhirDeIdentification.this.type

    Sets the value of fixedMaskLength. This is the length of the masking sequence that will be used when the 'fixed_length_chars' masking policy is selected.

    Definition Classes
    LightDeIdentificationParams
  159. def setGenderAwareness(value: Boolean): FhirDeIdentification.this.type

    Whether to use gender-aware names during obfuscation. This param affects only names. Setting it to true might decrease performance. Default: false

    Definition Classes
    BaseDeidParams
  160. def setInputCol(value: String): FhirDeIdentification.this.type

    Set the input column name. The input column should contain the FHIR string.

  161. def setKeepMonth(value: Boolean): FhirDeIdentification.this.type

    Sets whether to keep the month intact when obfuscating date entities. If true, the month will remain unchanged during the obfuscation process. If false, the month will be modified along with the year and day. Default: false.

    Definition Classes
    LightDeIdentificationParams
  162. def setKeepTextSizeForObfuscation(value: Boolean): FhirDeIdentification.this.type

    Sets keepTextSizeForObfuscation param

    Definition Classes
    BaseDeidParams
  163. def setKeepYear(value: Boolean): FhirDeIdentification.this.type

    Sets whether to keep the year intact when obfuscating date entities. If true, the year will remain unchanged during the obfuscation process. If false, the year will be modified along with the month and day. Default: false.

    Definition Classes
    LightDeIdentificationParams
  164. def setLanguage(s: String): FhirDeIdentification.this.type

    The language used to select the regex file and some faker entities: 'en' (English), 'de' (German), 'es' (Spanish), 'fr' (French), 'ar' (Arabic), or 'ro' (Romanian). Default: 'en'

    Definition Classes
    BaseDeidParams
  165. def setMappingRules(value: HashMap[String, String]): FhirDeIdentification.this.type
  166. def setMappingRules(value: Map[String, String]): FhirDeIdentification.this.type

    Sets FHIR field de-identification rules for primitive type obfuscation.

    Overview

    Defines how specific FHIR elements should be de-identified using FHIR Path syntax. Supports all FHIR primitive types with built-in obfuscation strategies.

    Rule Format
    Map(
      "ResourceType.field.path" -> "SupportedEntityClass",
    )
    value

    A mapping between FHIR paths and target primitive types. Keys must use standard FHIR Path notation (dot-delimited). Values must be one of the supported de-identification entity classes or given as a custom list.

    Example:
    1. Basic Usage

      new FhirDeIdentification()
        .setMappingRules(Map(
          "Patient.birthDate" -> "Date",
          "Patient.name.given" -> "Name",
          "Patient.telecom.value" -> "Email",
          "Patient.address.city" -> "City"
        ))
    Exceptions thrown

    If:

    • Unsupported primitive type provided
    • Malformed FHIR path detected
    • Non-primitive field targeted
    Note

    Important Constraints:

    1. Paths are case-sensitive and must match FHIR element names exactly.
    2. Array elements should use standard FHIR Path syntax (e.g., Patient.name.given).
    3. Only primitive types are supported for de-identification.

    See also

    FHIR Path Specification
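
    Because malformed paths cause an exception, it can help to check rule keys before passing them to setMappingRules. The sketch below is a hypothetical client-side pre-check, not the library's own validation: it accepts only simple dot-delimited paths of the form ResourceType.field.path, and the object name MappingRuleCheckSketch and its regular expression are assumptions made for illustration.

```scala
object MappingRuleCheckSketch {
  // Assumed shape: a capitalized resource type followed by one or more
  // dot-separated element names that start with a lowercase letter.
  private val PathPattern = "^[A-Z][A-Za-z]*(\\.[a-z][A-Za-z0-9]*)+$".r

  // Returns the rule keys that do not match the assumed simple path shape.
  def malformedPaths(rules: Map[String, String]): Set[String] =
    rules.keySet.filterNot(path => PathPattern.pattern.matcher(path).matches())
}
```

    An empty result means every key has the simple dot-delimited shape; it does not guarantee that the path exists in the chosen FHIR version or that it targets a primitive field.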

  167. def setMaskingPolicy(value: String): FhirDeIdentification.this.type

    Selects the masking policy:

    • 'entity_labels': Replaces each value with its entity label.
    • 'same_length_chars': Replaces the value with asterisks of the same length minus two, plus a bracket on each end. If the entity is shorter than 3 characters (like Jo, or 5), asterisks are used without brackets.
    • 'fixed_length_chars': Replaces the value with a masking sequence composed of a fixed number of asterisks.
    • Default: 'entity_labels'
    Definition Classes
    LightDeIdentificationParams
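    The three policies described above can be sketched as plain string functions. This is a minimal illustration of the documented behavior, not the library's code: the object name MaskingPolicySketch, the use of square brackets, and the angle-bracket label format (taken from the mask-mode example "<PERSON>") are assumptions.

```scala
object MaskingPolicySketch {
  // 'entity_labels': replace the value with its entity label (format assumed).
  def entityLabels(label: String): String = s"<$label>"

  // 'same_length_chars': asterisks of length minus two, wrapped in brackets;
  // values shorter than 3 characters get asterisks only.
  def sameLengthChars(text: String): String =
    if (text.length < 3) "*" * text.length
    else "[" + "*" * (text.length - 2) + "]"

  // 'fixed_length_chars': a fixed number of asterisks, regardless of input length.
  def fixedLengthChars(fixedLength: Int): String = "*" * fixedLength
}
```

    Note that sameLengthChars preserves the original length in both branches, which is the point of that policy.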
  168. def setMode(m: String): FhirDeIdentification.this.type

    Mode for Anonymizer ['mask'|'obfuscate']. Default: 'mask'

    • Mask mode: The entities will be replaced by their entity types.
    • Obfuscate mode: The entity is replaced by an obfuscator's term.
    Definition Classes
    LightDeIdentificationParams
    Example:
    1. Given the following text: "David Hale visited EEUU a couple of years ago"

      • Mask mode: "<PERSON> visited <COUNTRY> a couple of years ago"
      • Obfuscate mode: "Bryan Johnson visited Japan a couple of years ago"
  169. def setObfuscateDate(s: Boolean): FhirDeIdentification.this.type

    obfuscateDate param is not supported in FhirDeIdentification. It is always true.

    Definition Classes
    FhirDeIdentification → LightDeIdentificationParams
    Exceptions thrown
  170. def setObfuscateRefSource(s: String): FhirDeIdentification.this.type

    The source used to obfuscate the entities. The values are the following:

    • 'file': Takes the entities from the obfuscatorRefFile.
    • 'faker': Takes the entities from the Faker module.
    • 'both': Takes the entities from the obfuscatorRefFile and the Faker module randomly.

    Definition Classes
    BaseDeidParams
  171. final def setOutputCol(value: String): FhirDeIdentification.this.type
    Definition Classes
    HasOutputAnnotationCol
  172. def setParserType(value: String): FhirDeIdentification.this.type

    Sets the value of parserType, the parser type used to parse the FHIR string. The class overview lists JSON and XML as the supported parser types.

  173. def setRegion(value: String): FhirDeIdentification.this.type

    region param is not supported in FhirDeIdentification. Please use dateFormats instead.

    Definition Classes
    FhirDeIdentification → LightDeIdentificationParams
    Exceptions thrown
  174. def setSameLengthFormattedEntities(entities: Array[String]): FhirDeIdentification.this.type

    List of formatted entities to generate the same length outputs as original ones during obfuscation. The supported and default formatted entities are: PHONE, FAX, CONTACT, ID, IDNUM, BIOID, MEDICALRECORD, ZIP, VIN, SSN, DLN, LICENSE, PLATE, IRS, CFN, ACCOUNT.

    Definition Classes
    BaseDeidParams
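    To illustrate what "same length" output means for formatted entities such as PHONE or SSN, the sketch below replaces each digit with a random digit and each letter with a random letter of the same case, while leaving punctuation and spaces in place. The object name SameLengthSketch and this character-by-character strategy are assumptions made for illustration; the documentation does not describe how the library generates its replacements.

```scala
import scala.util.Random

object SameLengthSketch {
  // Hypothetical format-preserving replacement: the output has the same length
  // and the same separator positions as the input.
  def obfuscate(value: String, seed: Int): String = {
    val rng = new Random(seed)
    value.map {
      case c if c.isDigit => ('0' + rng.nextInt(10)).toChar
      case c if c.isUpper => ('A' + rng.nextInt(26)).toChar
      case c if c.isLower => ('a' + rng.nextInt(26)).toChar
      case c              => c // keep separators such as '(', ')', '-', and spaces
    }
  }
}
```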
  175. def setSeed(s: Int): FhirDeIdentification.this.type

    The seed used to select the entities in obfuscate mode. With a fixed seed, you can repeat an execution several times and get the same output.

    Definition Classes
    DeidModelParams → BaseDeidParams
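    The general idea behind a seed can be shown with scala.util.Random: the same seed always produces the same sequence of picks. This is a sketch of the concept only; the object name SeededPickSketch and the replacement pool are hypothetical, and the library's actual selection logic is not described in this documentation.

```scala
import scala.util.Random

object SeededPickSketch {
  // Hypothetical replacement pool; a real obfuscator would draw from
  // Faker data or a reference file.
  val pool: Vector[String] =
    Vector("Bryan Johnson", "Alice Moore", "Carlos Diaz", "Mei Tanaka")

  // Picks `count` replacements using a generator initialized with `seed`.
  def pick(seed: Int, count: Int): Vector[String] = {
    val rng = new Random(seed)
    Vector.fill(count)(pool(rng.nextInt(pool.length)))
  }
}
```

    Calling pick twice with the same seed and count returns identical vectors, which is what makes a seeded run repeatable.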
  176. def setSelectiveObfuscationModes(value: HashMap[String, List[String]]): FhirDeIdentification.this.type
    Definition Classes
    LightDeIdentificationParams
  177. def setSelectiveObfuscationModes(value: Map[String, Array[String]]): FhirDeIdentification.this.type

    Sets the value of selectiveObfuscationModes, the dictionary of modes that enables multi-mode de-identification.

    • 'obfuscate': Replaces the values with random values.
    • 'mask_same_length_chars': Replaces the value with asterisks of the same length minus two, plus a bracket on each end.
    • 'entity_labels': Replaces the values with the entity label.
    • 'mask_fixed_length_chars': Replaces the value with a fixed number of asterisks. You should also invoke "setFixedMaskLength()".
    • 'skip': Leaves the entities intact.

    Entities that are not listed in the dictionary are de-identified according to setMode().

    Example:

    val deIdentification = new LightDeIdentification()
     .setInputCols(Array("ner_chunk", "sentence")).setOutputCol("dei")
     .setMode("mask")
     .setSelectiveObfuscationModes(Map(
         "OBFUSCATE" -> Array("PHONE", "email"),
         "mask_entity_labels" -> Array("NAME", "CITY"),
         "skip" -> Array("id", "idnum"),
         "mask_same_length_chars" -> Array("fax"),
         "mask_fixed_length_chars" -> Array("zip")
     ))
     .setFixedMaskLength(4)
    Definition Classes
    LightDeIdentificationParams
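    The fallback rule described above (entities missing from the dictionary follow setMode()) can be sketched as a lookup built from the mode-to-entities map. This is an illustration, not the library's implementation: the object name SelectiveModeSketch is hypothetical, and case-insensitive matching is an assumption suggested by the mixed-case keys and entity names in the example.

```scala
object SelectiveModeSketch {
  // Builds an entity -> mode lookup from a mode -> entities dictionary.
  // Entities that are not listed fall back to `defaultMode`.
  def resolver(modes: Map[String, Array[String]], defaultMode: String): String => String = {
    val byEntity: Map[String, String] = modes.toSeq.flatMap {
      case (mode, entities) =>
        entities.toSeq.map(entity => entity.toUpperCase -> mode.toLowerCase)
    }.toMap
    entity => byEntity.getOrElse(entity.toUpperCase, defaultMode)
  }
}
```

    For instance, with "OBFUSCATE" -> Array("PHONE", "email") and a default of "mask", the resolver maps EMAIL to obfuscate and any unlisted entity to mask.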
  178. def setUnnormalizedDateMode(mode: String): FhirDeIdentification.this.type

    The mode to use if the date is not formatted. Options: [mask, obfuscate, skip] Default: obfuscate

    Definition Classes
    LightDeIdentificationParams
  179. def setUseShiftDays(s: Boolean): FhirDeIdentification.this.type

    useShiftDays param is not supported in FhirDeIdentification. Please use days instead.

    Definition Classes
    FhirDeIdentification → LightDeIdentificationParams
    Exceptions thrown
  180. def shouldUseConsistentNameParts(entityClass: String): Boolean
    Attributes
    protected
    Definition Classes
    DeidModelParams
  181. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  182. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  183. def transform(dataset: Dataset[_]): DataFrame
    Definition Classes
    FhirDeIdentification → Transformer
  184. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  185. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  186. def transformSchema(schema: StructType): StructType
    Definition Classes
    FhirDeIdentification → PipelineStage
  187. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  188. val uid: String
    Definition Classes
    FhirDeIdentification → Identifiable
  189. val unnormalizedDateMode: Param[String]

    The mode to use if the date is not formatted. Options: [mask, obfuscate, skip] Default: obfuscate

    Definition Classes
    LightDeIdentificationParams
  190. val useShiftDays: BooleanParam

    Whether to use the random shift day when the document has this in its metadata. DocumentHashCoder can create 'dateshift' based on the document. Default: false

    Definition Classes
    LightDeIdentificationParams
  191. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  192. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  193. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  194. def write: MLWriter
    Definition Classes
    ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable
