ISSN: 2378315X BBIJ
Biometrics & Biostatistics International Journal
Research Article
Volume 4 Issue 4  2016
Dynamic Connectivity Mapping of Electrocorticographic Data using Bayesian Differential Structural Equation Modeling
Price LR^{1}*, Vargas RV^{2}, Behroosmand R^{3}, Parkinson AL^{2}, Larson CR^{4}, Greenlee JDW^{5} and Robin DA^{2}
^{1}Professor Psychometrics & Statistics, Texas State University, USA
^{2}Research Imaging Institute, University of Texas Health Science Center, USA
^{3}Department of Communication Sciences and Disorders, University of South Carolina, USA
^{4}Department of Communication Sciences and Disorders, Northwestern University, USA
^{5}Department of Neurosurgery, University of Iowa Hospitals and Clinics, USA
Received: May 16, 2016  Published: September 14, 2016
*Corresponding author:
Larry RP, Professor  Psychometrics & Statistics, Texas State University, 601 University Drive, 325 ASB South, USA, Tel: 5127549021; Email:
Citation:
Price LR, Vargas RV, Behroosmand R, Parkinson AL, Larson CR, et al. (2016) Dynamic Connectivity Mapping of Electrocorticographic Data using Bayesian Differential Structural Equation Modeling. Biom Biostat Int J 4(4): 00102. DOI:
10.15406/bbij.2016.04.00102
Abstract
Advances in the sophistication of imaging techniques necessitate the development of techniques to model the neural and cognitive phenomena they represent. Using electrocorticographic (ECoG) data, we propose a datadriven approach using ordinary differential equations and Bayesian differential structural equation modeling (BdSEM) to model effective connectivity related to sensorimotor integration. First, we tested the regionbased covariance structure across subjects by each experimental condition to evaluate the tenability of pooling subject data to perform group level versus singlesubject analyses. Second, we applied a differential equation approach to model dynamic change originating from regional neuronal states across the data acquisition period. Finally, we employed an informationtheoretic search strategy to identify the optimal connectivity model within each experimental condition for a single subject. Results of subjectspecific (intraindividual) relationship maps include effective, contemporaneous and delayed effective connections of across different brain regions.
Abbreviations
ECoG: Electro Cortico Graphic; BdSEM: Bayesian Differential Structural Equation Modeling; ODEs: Ordinary Differential Equations; ROI: Region Of Interest; fMRI: functional Magnetic Resonance Imaging; MCMC: Markov Chain Monte Carlo; FDA: Functional Data Analysis; STG: Superior Temporal Gyrus; PostSTG: Posterior STG; MidSTG: Middle STG; BIC: Bayesian Information Criterion; AIC: Akaike Information Criterion; FF: Feed Forward; FB: Feed Back; ERP: Evoked Response Potential
Introduction
To date, imaging neuroscience and electrophysiology have provided a solid foundation for functional specialization as a principle of brain organization in humans. However, accurately modeling functional integration of specialized areas of the brain has proven to be a more difficult task [1]. For example, the human brain can be viewed as a holistic and dynamic system involving functionally specialized areas or regions that are related by effective connections during cognitive processing. Connectivity models (functional or effective) have become useful approximations to holistic and dynamic systems for understanding underlying relationships between regions of neural activation. For example, functional connectivity models provide relational maps based on statistical dependency (correlations) between remote neuro physiological events. Effective connectivity models involve estimating the influence that one neuronal system exerts over another, either at the synaptic or population level [1]. Furthermore, effective connectivity is activitydependent involving interactions among regions of the brain. Finally, modeling intrinsic connectivity involves capturing dynamic elements of the system reflecting the essential nature of the collective system.
One approach to modeling effective connectivity is based on intracranial electrocorticographic (ECoG) time series data. In ECoGbased studies, data are acquired from multicontact subdural electrodes implanted during surgical evaluation in epilepsy patients. ECoG provides a higher signal to noise ratio and temporal and spatial resolution than EEG due to the fact that ECoG signals are not influenced by the low conductivity of the skull and the fact that measurements are acquired in the vicinity of the underlying brain sources [2]. In clinical practice, ECoG has become the “gold standard” for defining epileptogenic zones [3]. The locations of ECoG electrodes implantation are determined using preoperative clinical data such as ictal semiology, ictal and interictal features on scalp EEG, and structural MRI findings [4]. The pattern of the subsequent ECoG recordings is then used for further localization of epileptogenic foci.
In this study, we present an innovative modeling approach for the development and estimation of human electrophysiological inferential connectivity maps using ECoG data. The aim of the method is to obtain accurate representations of underlying effective relationships while also considering intrinsic or dynamically changing aspects of regional components of brain activity. Specifically, we use ordinary differential equations within a structural equation modeling framework and a heuristic model search strategy to provide insights into electrodetoelectrode connectivity using ECoG. Here, our interest is based on modeling simultaneous or dynamic change in neural activity so as to provide unique insights into the causal nature of the relationships between brain regions in specific vocalization and auditory processes. Given the high spatial and temporal resolution of ECoG data, inferences about underlying neural functions as mapped on to brain anatomy allow for exquisite representation of both time and space in the human brain [3,5,6].
The organization of this manuscript is as follows. The first section briefly reviews SEM and the advantages of using Bayesian SEM in studies such as ours. The second section details the application of ordinary differential equations (ODEs) to measure simultaneous change in brain activity propagating from the level of neural activity upward to the region of interest (ROI) level. The third section introduces a model for Bayesian SEM based on differential equations (BdSEM) and describes the heuristic specification search procedure used to arrive at optimal models of effective connectivity. The fourth section outlines a simulation study conducted to evaluate the efficacy of the modeling strategy.
Structural equation modeling:A frequently used technique used for modeling populationlevel relationships among regions caused by neural activation in brain regions is structural equation modeling (SEM). SEM was first adapted to imaging data by McIntosh & GonzalesLima [7,8] to examine effective [9] relationships between regions of interests (ROIs). To date, SEM of neural connectivity has predominantly been applied to functional magnetic resonance imaging (fMRI). In part, this trend may reflect difficulties in source localization that are inherent in most electrophysiological data capture methods.
Application of SEM involves developing a set of simultaneous equations to estimate
 The regression coefficients between measurements on observed variables and associated latent variables (i.e. a measurement model) and
 Relationships among latent variables comprising a hypothesized model. SEM is based on the general linear and/or generalized nonlinear model and is very useful for complex or longitudinal data structures (e.g., multilevel random coefficients models, growth curve modeling, and differential equationbased manifest or latent change). Perhaps SEM is most often associated with confirmatory (theory confirming) modeling approaches. However, SEM can be used to conduct heuristic search techniques in high dimensional data structures to locate an optimal model from among a set of competing ones. Heuristic search techniques leverage informationtheoretic algorithms from the field of artificial intelligence (e.g., mathematicallybased approaches based on lagrangian heuristic/incomplete branchandbound algorithms). Application of SEM involves measuring the discrepancy between a parameterized causal structure of hypothesized relationships (e.g., between observed or latent variables or a combination of both) and estimated parameters from data as measured through a series of fit indices [10,11]. SEM employing Bayesian probability and statistics [12,13] has recently emerged and provides increased precision and flexibility in modeling scenarios with high dimensional data structures, small sample size and nonnormally distributed variables.
Statistical inference and bayesian learning:The history and development of Bayesian statistical methods are substantial and closely related to frequentist statistical methods. In some ways, Bayesian statistical thinking can be viewed as an extension of the traditional (i.e., frequentist) approach, in that it formalizes aspects of the statistical analysis that are left to uninformed judgment by researchers in classical statistical analyses [14]. Bayesian methods include data analytic techniques that are derived from the principles of Bayesian statistical inference. Statistical induction involves learning about the characteristics of a population from a subset of members of a particular population. Numerical values of populations are expressed as parameters (θ) while numerical values of the subset of the population are expressed as (y). Given the numerical values (y) in the subset or sample dataset, uncertainty is reduced about the population parameters. Quantifying this shift in uncertainty is the goal of Bayesian inference. The parameter space Θ is the set of all possible parameter values from which we wish to identify the value(s) that best reflect the true population parameters (e.g., regression weights in connectivity models). Bayesian learning involves a numerical formulation of the joint beliefs about y and θ, expressed as probability distributions over y and θ. In short, Bayesian learning involves the components listed below.
 Given each value
$\theta \in \Theta $
the prior distribution
$p(\theta )$
describes the belief that
$(\theta )$
represents the true population parameter.
 Given each value
$\theta \in \Theta $
and
$y\in Y$
the sampling model
$p(y\theta )$
captures the belief that
$y$
will be the outcome of a particular study if we knew
$\theta $
to be correct or true.
 After
$y$
is acquired, each numerical value of the posterior distribution
$p(y\theta )$
describes the belief that
$\theta $
is the true value having observed dataset
$y$
.
The posterior distribution for model parameters is derived using Bayes rule as in Equation 1.
$p(\theta y)=\frac{p(y\theta )p(\theta )}{{\displaystyle {\int}_{\Theta}p(y\tilde{\theta})p(\tilde{\theta})d\tilde{\theta}}}$
(1)
A core component of Bayesian modeling is accurately modeling the generating process (e.g., unknown causes modeled using a prior distribution) and unknown population parameters
$\theta $
to observed sensory data values (
$y$
). Bayesian statistical methods are particularly wellsuited for developing generative or recognition models of complex systems [1517] because the goal in Bayesian learning is to model the generating process that produced the observed data values. For example, in generative and recognition models, functions are applied thereby allowing a mechanism for mapping causes to sensory input. Specifically, the goal of generative modeling is to: “learn representations that are economical to describe but allow the input to be reconstructed accurately” [18]. The goal is to make inferences about the causes and learn the parameters. Bayesian probability provides a natural framework for linking unknown parameters and causes to observed data. Finally, in the classical school of probability, the sample data values are selected randomly with the statistics estimated being fixed point estimates of population parameters. Conversely, in the Bayesian school of probability and inference, the sample is fixed (i.e. not considered random) with the parameters estimated being random (e.g., obtained using Markov chain Monte Carlo [MCMC] resampling methods). Because our estimated parameters are random variables, we can make probabilistic statements about their certainty with a high level of precision.
In the Bayesian modeling approach, we view any unknown quantity (e.g., parameter) as random and these quantities are assigned a probability distribution (e.g., normal, Poisson, multinomial, geometric, etc.) that provides the impetus for generating a particular set of data. In this study, our unknown population parameters were modeled as being random and then assigned to a joint probability distribution. In this way, we were able to summarize our current state of knowledge regarding the model parameters. The samplingbased approach to Bayesian estimation provides a solution for the random parameter vector θ by estimating the posterior density or distribution of a parameter. This posterior distribution is defined as the product of the likelihood function (accumulated over all possible values of θ) and the prior density of θ [19]. In our case, a generative model is specified in terms of a prior distribution relative to the neuronal activity (i.e. the causative mechanism of the observed data).
Modeling long time series relative to small sample size:Developing statistical models for studying populations relevant to neuroscience often poses considerable challenges due to small sample sizes, the issue of low statistical power, and the length of the time series (i.e. number of repeated measurements) being quite long (e.g., > 100 time points). One analytic approach appropriate for the challenging scenarios previously noted are multivariate autoregressive models [2022,13]. The use of Bayesian statistical modeling is more sensitive for hypothesis testing and interval estimation than frequentist approaches when there are small sample sizes with multivariate structure and a long time series [2325,19]. In the present study, the number of time series measurements is large (i.e. > 1000) with the length of the time series greater than the sample size (N=1). Although our data acquisition involved multiple trials for each electrode within each subject, the average time series (waveform) for each site within each subject was used for BdSEM modeling.
Modeling fMRI data involves capturing the degree of deoxygenated vascularization relative to baseline for each voxel or congregation of voxels corresponding to ROIs across multiple trials (i.e. time series) nested within subjects. Each case within the timeseries data structure for each ROI corresponds to the peak, or averaged peak, of hemodynamic activity for that trial. Furthermore, after slice timing correction, the resulting data structures for each ROI are obtained contemporaneously for any given trial. When extended to multivariate regression or multivariate autoregressive models (or SEMs), mapping hypothesized effective connectivity relationships (including those with theoretical support) becomes fallacious given that inferences of causal influence of one region on another requires temporal precedence. For example, the empirical conditions for inferring a causal relationship between two variables include (a) X is related to Y, (b) X temporally precedes Y, and (c) the relationship between X and Y is not mediated by a third variable  Z [15,26]. Although more complicated, the same rules apply when modeling the relationships in a system of simultaneous equations (e.g., more than two variables).
Kim et al. [27] developed a Unified SEM technique of fMRI timeseries using data on visual attention using multivariate auto regression to model both contemporaneous and temporal (longitudinal) relationships simultaneously. However, the effective relationships determined from delayed representations and realtime representations of data were separated by a single trial with a time resolution of 3 seconds. Moreover, inferences of interregional causal relationships between multiple regions’ underlying electrophysiology were made from vascular data that was obtained 3 seconds prior. Also, Unified SEM requires relationships between parameters to be specified a priori. Gates et al. [28] improved upon this technique, by developing an Extended Unified SEM technique integrating an automatic search procedure based on Lagrangian multiplier tests, or modification indices. In this paper, we improve on these procedures by the estimation of electrophysiological connectivity maps to obtain more accurate representations of effective relationships by leveraging the high temporal and spatial resolution of (ECoG), a form of electrophysiological data collected from electrodes placed directly on the brain cortical surface during neurosurgery [2,3,5,6]. ECoG remedies issues of localization frequently encountered in EEG while retaining the advantage of high temporal resolution (0.5 – 1.0ms) and spatial resolution (diameter 2.03.0mm). The improved temporal resolution relative to fMRI makes ECoG ideal for reducing bias to very small or optimal levels specific to inferences based on underlying neurophysiology. Leveraging the high quality of ECoG data, we present a novel method for modeling dynamic data structures of ROIs to be input in a datadriven method of modeling the relationships between regions of the cortical voice network.
Differential structural equation modeling:Modeling dynamic intraindividual change using differential equations based on long time series is a rapidly evolving technique. Functional data analysis (FDA) [29] provides an approach for fitting differential equations directly from acquired measurements in studies of human growth and EEG. Using differential equations within SEM provides an approach for applying the FDA approach to model the effective connectivity expressed as dynamic coupling between the neuronal states of various brain systems when exposed to experimental conditions. Typically, in neuroimaging studies, experimental conditions are modeled as inputs via boxcar or stick functions. Here we approximate the (reciprocal) relationships among brain region activity using a bilinear approximation.
Ordinary differential equations (ODEs):Using ODEs to model neuronal activity provides a way to express the rate of change of the states as a parameterized function of the states and inputs. In Equation 2,
$\text{u}(\text{t})$
represents a particular experimental stimulus as an input. Here we use ODEs to model instantaneous changes in neuronal activity in two steps. First, we employ neuronal state equations thereby linking the derivatives of neuronal states
$\text{x}(t)=({x}_{1}(t),\dots ,{x}_{d}(t))\text{'}$
of d brain regions to themselves under the influence of an experimental condition [2]. The ODEbased dynamic model assumes a Markov property whereby instantaneous changes of the system depend only on system states and experimental inputs at the same moment in time. Importantly, the Markov property is tenable for the brain system performing simple auditory and vocalization experimental tasks as in our study. The ODEbased state equation detailing the dynamic changes of neural states is provided in Equation 2.
$\frac{\text{d}\text{x}\text{(t)}}{\text{d}t}={\text{F}}_{1}(\text{x}\text{(t)},\text{u}(\text{t}),{\text{\theta}}_{1})$
(2)
Where F1 is a set of nonlinear basis functions capturing neuronal influences that specific brain regions
$\text{x}\text{(t)}$
and experimental stimuli
$\text{u}(\text{t})$
exert on other specific regions. The vector
${\text{\theta}}_{1}$
contains the unknown parameters in the system. In Equation 2,
$\text{x}\text{(t)}$
is a continuous function reflecting an average of the neuronal activity in a specific region. Equation 3 conveys the observationlevel (output) equation that enables a description of how the underlying neuronal activity causes changes in the observed data vector
$y$
in each ROI.
$y\text{(t)=}{\text{F}}_{2}(\text{x}(\text{t}),{\theta}_{2},\text{e}(\text{t}))$
(3)
Where
${\text{F}}_{2}$
is an unknown function,
${\theta}_{2}$
are parameters to be estimated, and
$\text{e}(\text{t})$
are error terms. Equations 2 and 3 are requisite to modeling any dynamic system (i.e. they link underlying or hidden states to observable outputs, [30]). Particularly relevant to our study is that the causal relationships between the outputs and the inputs conform to a Volterra series, which expresses the outputs as a generalized convolution of the input – without reference to the hidden states
$\text{x(t)}$
[1]. In short, the Volterra series is a functional Taylor expansion of the outputs with respect to the inputs.
Next, in Equation 4 we are able to include a bilinear approximation to F1 by combining elements in Equations 2 and 3. In Equation 4, causal influences among regions are possible to elucidate since bidirectional estimates of parameters are included in the
${\text{A}}_{{i}_{1}{i}_{2}}$
and
${\text{A}}_{{i}_{1}{i}_{2}}$
portion of the equation. These bidirectional influences include the dynamic effect of the time series. The parameters in the
$B$
matrix enable the estimation of stimulusdependent effective connectivity between component regions. Parameters are allowed to vary over the time course of the data capture (ECoG signal) in order to ensure accurate approximation to F1.
$\frac{\text{d}\mathrm{x}(\text{t})}{\text{d}(\text{t})}=Ax(\text{t})+{\displaystyle \sum _{j=1}^{J}\text{u}}(\text{t})\cdot {B}_{j}x(\text{t})+Cu(\text{t})+D$
(4)
Here
$A$
=
${({\text{A}}_{{i}_{1}{i}_{2}})}_{\text{dxd}}$
with
${\text{A}}_{{i}_{1}{i}_{2}}$
denotes the effect of component
${\text{i}}_{2}$
on component
${\text{i}}_{1}$
exerted at the current state;
${B}_{j}$
=
${({\text{B}}_{\text{j,}{\text{i}}_{\text{1}}{\text{i}}_{\text{2}}})}_{\text{dxd}}\text{,j=1,\u2026,J,}$
couples the j^{th} stimulus with the neuronal states and nonzero
${\text{B}}_{\text{j,}{\text{i}}_{\text{1}}{\text{i}}_{\text{2}}}$
implies that the effect exerted by component
${\text{i}}_{2}$
on component
${\text{i}}_{1}$
depends on stimulus j;
$C={({\text{C}}_{\text{ij}})}_{\text{dxJ}}$
with
${\text{C}}_{\text{ij}}$
indicated the effect of stimulus j on component i; and
$D=({\text{D}}_{\text{1,\u2026,}}{\text{D}}_{\text{d}})\text{'}$
with
${\text{D}}_{\text{i}}$
representing the intercept for component i [2].
To estimate parameters at the ROI (observation) level, we use Equation 5.
$y\text{(t)=}x\text{(t)+}e\text{(t)}$
(5)
Where
$e\text{(t)}$
is a ddimensional vector of measurement errors with mean error of zeros. Further, we assume that the distribution of errors is normally distributed with mean zero and variance
${\sigma}_{\text{i}}^{\text{2}}$
. Finally, to estimate the differential equations, we use basis function expansion as described in Ramsay & Silverman [31]. The basis function expansion approach is appropriate due to the temporally dense observations of brain regions and the average of a large number of neuron activities in the region of interest. The basis function approach allows for closed form solutions based on a large number of neural activity and the associated derivatives of the regional activity. Estimating parameters in the bilinear model is straightforward and reduces to solving d linear regression equations.
Analytic strategy:Here we use a threestep approach to achieve an optimal model of effective connectivity. First, we fit differential equations to ECoG signal data to capture instantaneous change of neuronal activity within each region of interest based on data acquired via target subdural electrode sites. Second, we developed a BdSEM to model effective connectivity of the brain system using derivatives and observational data of each ROI. Third, we use a Bayesian informationtheoretic algorithm to identify the optimal network model (i.e. the model with the highest posterior probability of being optimal). Our approach is flexible in that one can (a) use the entire ECoG electrode array to model signals from a large amount of regional activity (e.g., 48 to 96 electrodes within a specific area of the brain) propagating from dense neuronal activity, or (b) use one ECoG electrode (single channel) within the array to model a targeted brain region. The first example above begins by modeling the neuronal dynamics and is similar to Dynamic Causal Modeling (DCM, Friston et al. [32]) because the brain is viewed as a continuoustime dynamic system of neuronal activity where signals propagate upward to an observational or population level. DCM incorporates two equations: a bilinear equation capturing neuronal activity in a bidirectional, nonlinear fashion, and an observationlevel equation enabling effective connectivity modeling. The approach presented here is viewed as a special case of a DCM.
Methods
Customized highdensity electrode arrays were implanted on the pial surface of exposed cortex for all subjects. Electrode arrays consisted of 96 platinumiridium disc electrodes embedded within a silicon sheet with 5.0 mm centertocenter spacing and 3.0 mm contact diameter (AdTech, Racine, WI). Since grid placements were tailored to clinical considerations for each subject, exact placements differed. However, analyzed ROIs , namely, inferior frontal gyrus (IFG), premotor cortex (PreM), primary motor cortex (M1) and 2 regions from superior temporal gyrus (STG), posterior STG (PostSTG) and middle STG (MidSTG) were significantly covered for each subject. Moreover, our task was a sustained vocalization task (e.g., produce and sustain the vowel /a/) during recording of cortical signals (see Experimental Design section) and these regions are well known to be associated with sensory motor control of vocalization.
Subjects: Four male patients ages 31 to 47, mean (41 years) undergoing neurosurgical treatment for medically intractable epilepsy served as subjects for this study. Written consent was obtained from all subjects and all research protocols were approved by University of Iowa Human Subjects Review Board. Experiments were conducted in an electromagneticallyshielded private suite in the University of Iowa General Clinical Research Unit.
All subjects underwent comprehensive presurgical neurological examination, brain imaging, neuropsychological evaluation and audiometric testing to confirm normal hearing, speech, and language function. No anatomical lesions were detected for cortical regions of interest. Subjects underwent preoperative sodium amobarbital (Wada) testing revealing left hemispheric language dominance in 3 subjects (including s186) and bilateral dominance in one subject number. Detailed description of our electrode arrays and localization of recording sites can be found in previous studies from our lab [5,6].
Experimental design:Subjects underwent 2 blocks of vocalization and playback tasks. The vocalization task required subjects to produce and maintain vocalization of the vowel /a/ for 2 seconds at a natural conversational pitch and volume (approximately 7075 dB). The vocalization task was repeated 3050 trials with 12second breaks between the selfpaced trials. Voice sound was captured by a microphone (Beta 87C, Shure, Niles, IL) located near the subject’s mouth, amplified (10 dB gain; Ultralite MK3, MOTU, Cambridge, MA), and passed through a harmonizer (Eclipse, Eventide, Little Ferry, NJ). Auditory feedback stimuli were delivered bilaterally through insert earphones (ER4, Etymotic, Elk Grove Village, IL) placed in custom fitted, vented ear molds for each subject. A 10 dB feedback amplification gain was inserted between the voice sound and its auditory feedback to partially mask the potentially confounding effects of boneconduction. During the playback task, subjects were instructed to listen to the recorded sound signal of their same selfproduced vocalizations. The gain of the signal during playback condition was adjusted at a nearly equal level to voice feedback during vocalization block. The total duration of each block was approximately 5–8 minutes. Subjects were given short breaks (2 minutes) between successive blocks.
Data processing:Data acquired from electrode arrays consisting of 96 platinumiridium disc electrodes embedded within a silicon sheet with 5.0 mm centertocenter spacing and 3.0 mm contact diameter yielded the data structure from underlying neural populations. At the observation level, a single central electrode was selected to represent the activation for each of the 5 ROIs (a) IFG, (b) PreM, (c) M1, (d) PostSTG, and (e) MidSTG based on gross anatomical surface landmarks for each subject. Electrophysiological signals for 2second intervals were averaged across trials. We used short intervals to maintain an efficient bilinear approximation of nonlinear connectivity relationships among components. For both playback and vocalization conditions, a 500millisecond interval (obtained at 0.5ms temporal resolution) was isolated starting from onset of vocalization/playback. This short interval was necessary as the subjects were subsequently presented with altered auditory feedback for other study purposes.
Assessing inter individual versus intra individual dynamics: Developing models of effective connectivity often proceeds with the goal of creating a model that accurately captures brain dynamics for a group of subjects. Standard statistical analyses (e.g., regression, ANOVA, SEM and other general linear modeling techniques) that yield inferences within a randomlyselected population of subjects are considered to be homogeneous in all aspects specific to the research endeavor. In this scenario, model parameters are derived based on pooling covariance matrices across subjects. Based on the homogeneity of subjects assumption, populationbased inferences are made based on individual differences between subjects (whether the study design is crosssectional or timeseries based). At the heart of this approach is the study of individual differences rather than withinperson (a.k.a. intra individual change). When the goal is to model a continuoustime complex neural system that changes dynamically, applying standard statistical techniques used for studying individual differences may be unjustified [33]. Testing this assumption involves the statistical property of ergodicity [34,35]. In the present study, we evaluated the ergodic property by testing the homogeneity of the variancecovariance matrices (assembled from the timeseries vector and their derivatives for each subject) prior to pooling matrices across subjects and proceeding with a grouplevel, traditional analytic technique. Specifically, we were interested in (a) measuring the degree to which neural changes in brain activity of a subject with intractable epilepsy occurs between temporal states of the ECoG signals and (b) if subjects’ data structures were homogeneous enough to develop a single connectivity model by study condition. To this end, developing a single model of effective connectivity for the four subjects in this study may or may not accurately represent the intra individual brain dynamics of each subject. Based on the results of our test, the individual subjects’ covariance matrices were not homogeneous (statistically different at p < .001 for each pair wise comparison corrected for Type 1 error); therefore we proceeded by using right (nonlanguage dominant) hemisphere data for subject 186 under both experimental conditions (i.e. speaking and playback).
Exploratory model development:To construct a model of effective connectivity that provided optimal fit, exploratory BdSEM using ROIs and every possible path was performed. Contemporaneous relationships were extracted based on the direct connections between both realtime to realtime and delayed to delayed relationships between two ROIs. The continuous signal acquired from ECoG was discretized into 1110 observations. Intrinsic relationships were interpreted as a path from the delayed representation (derivative) of a region to its realtime representation. Lastly, longitudinal relationships were derived from the delayed (derivative) representation of one region bearing a relationship to the realtime representation of another. Model fit was iteratively improved through the use of a heuristic search strategy [36,37]. Our use of heuristics was guided by the neuroscience experience of the team, mathematical logic and computational skills. We employed a mathematicallybased approach based on lagrangian heuristic/incomplete branchandbound algorithms. Decision rules for optimal model selection was based on Burnham & Anderson [38] guidelines for BCC interpretation as the BCC0 between 02 (no credible evidence that the model should be ruled out as being the KullbackLeibler (KL; [39]) best model for the population of possible samples).
Our exploratory model fitting protocol followed guidelines established from research in the informationtheoretic and Bayesian modeling fields [40,38]. Specifically, we employed the KullbackLeibler distance measure [39] as incorporated into the information theoretic measure the BrowneCudeck Criterion (BCC, [41]), to identify the model with the highest probability of being the correct model. The BCC was developed specifically for covariance structure modeling and imposes a greater penalty for model complexity than does the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The BCC is defined as:
$BCC=\widehat{C}+2q\frac{{\displaystyle \sum _{g=1}^{G}{b}^{(g)}}\frac{{p}^{(g)}({p}^{(g)}+3)}{{N}^{(g)}{p}^{(g)}{p}^{(g)}2}}{{\displaystyle \sum _{g=1}^{G}{b}^{(g)}({p}^{(g)}+3)}}$
(5)
Where,
$\widehat{C}$
is the minimum value of the discrepancy function,
$q$
is the number of parameters in the model,
$p$
is the number of sample moments in all groups combined,
${b}^{(g)}$
equals the total sample size (N) times the ratio of the sample size in a group
${N}^{(g)}$
to the total sample size (N),
${p}^{(g)}$
is the number of variables in an observed group,
${N}^{(g)}$
, and
$G$
is the number of groups in the model. Of particular relevance, the exploratory strategy we employed provides a mechanism for the prevention of over fitting (a particular challenge in selecting an optimal model from a very large number of competing models). Ensuring that model over fitting in heuristic specification search procedures does not occur in highdimensional data structures [37,42] is a challenging but not insurmountable. Exploratory modeling began with a null model with only the derivative for a specific ROI regressed on the respective ROI. Under this specification, the starting model representation is that each region has no significant correlation with the activity of any other region. Iterative model refinement evolved through evaluating model improvement in the likelihood ratio
${\chi}^{2}$
test, root mean square error of approximation (RMSEA; [43]) and BrowneCudeck Criterion (BCC, [41]) with successive path additions.
Bayesian SEM model development and refinement:After identification of optimal models for each study condition, BdSEM proceeded by modeling the population parameters using semiconjugate priors for θ ~ multivariate normal (~N 0, 4), Σ ~ inverseWishart (
${\eta}_{0}$
,
${S}_{0}^{1}$
; [4447,42 ]). The selection of priors was based on (a) a review of the distributional properties of the acquired ECoG time series, and (b) recommendations for using informative priors for complex models with small samples Gelman [46] and Asparouhov & Múthen [48]. Onethousand MCMC burnin iterations were used to establish convergence criteria for the joint posterior distribution of the model parameters and the criterion for acceptable posterior distribution summary estimates of parameters was set at 1.001 [42]. Bayesian estimation proceeded using the SEM facility in Mplus, version 7.3 [49]. Convergence was achieved at S=20,000 post burnin iterations after which posterior distributions were evaluated using time series, auto correlation plots, and the posterior predictive pvalues to judge the behavior of the MCMC convergence [44]. Time series and auto correlation plots revealed acceptable MCMC performance in all four subjects in left and right hemisphere models. Posterior predictive p values were acceptable for subject 186; p = .53 (vocalization condition) and .47 (auditory condition).
For the final model in our heuristic search, the BCC0 was observed as 1.22 (vocalization condition) and 2.0 (listening condition). Additionally, we used (a) an RMSEA of <.05, (b) comparative fit index (CFI) of >.95, and (c) the Bayesian information criteria (BIC) being the smallest among competing models as decision criteria. Figure 1 illustrates selected path loadings for the final right hemisphere vocalization and listening condition models for subject 186. Complete presentation of path loadings are provided in Tables 1&2.
Simulation study:In the next phase of our study, Markov chain Monte Carlo (MCMC, [50]) methods were used to examine the sampling distribution of the parameter estimates and their error structure for subject 186 under both study conditions. We conducted a simulation by evaluating the impact of sample size (N=1, 2, 3 and 5) on parameter estimation bias in vocalization and listening conditions. A byproduct of our simulation included a power analysis providing estimates for each regression path at each sample size condition. The Monte Carlo simulation facility in Mplus version 7.3 [49] was used to conduct the simulation study. The results of our simulation are presented next.
For the N=1 condition, 95% coverage was attained 21% of the time. Power analysis revealed that 32% of the parameter estimates were below a power of .85. For the N=2 condition, 95% coverage was attained 28% of the time. Power analysis revealed that 28% of the parameter estimates were below a power of .85. For the N=3 condition, 95% coverage was attained 27% of the time. Power analysis revealed that 38% of the parameter estimates were below a power of .85. For the N=5 condition, 95% coverage was attained 21% of the time. Power analysis revealed that 77% of the parameter estimates were below a power of 85 Tables 3&4.
In summary, the results of the simulation study revealed that at small sample sizes, grouplevel effective connectivity analyses are like to yield estimates with low statistical power, parameter estimates with bias greater than 5% at least 50% of the time. The 95% coverage was low ranging from 21% to 28%. The problems of low power and excessive parameter bias resolved upon reaching a sample size of N = 200,000. At this point, statistical power for all parameter estimates was greater than .95 and bias was below 5%. This pattern of findings concurs with the fully Bayesian model which also included 200,000 replications. Based on the results of our simulation, a fully Bayesian approach to modeling single subjectspecific effective connectivity using ECoG data is recommended since (a) the Bayesian probability is not based on classical frequency school of probability with random sampling of experimental units (i.e. subjects), and (b) the approach directly incorporates a Markov chain Monte Carlo (MCMC) simulation component in estimating the random parameters comprising a model. Using the Bayesian approach, prior information can be included in the parameter estimation process and Gibbs sampling within the MCMC framework can be leveraged to ensure accurate final model parameters.
Regression Paths 
Standardized Estimate 
Unstandardized Estimate 
S.E. 
SD 
p 
Relationship Type 
MidSTG 
< 
IFG 
0.28 
0.1 
0.01 
0.001 
0.01 
C 
PostSTG 
< 
IFG 
0.07 
0.03 
0.001 
0.013 
0.01 
C 
dM1 
< 
IFG 
0.21 
0 
0 
0 
*** 
D 
dPostSTG 
< 
IFG 
0.26 
0 
0 
0 
*** 
D 
M1 
< 
IFG 
0.59 
0.03 
0.001 
0.015 
0.02 
C 
dMidSTG 
< 
IFG 
0.06 
0 
0 
0 
0.09 
D 
PreM 
< 
IFG 
0.01 
0.01 
0.001 
0.011 
0.39 
C 
dPreM 
< 
IFG 
0.49 
0.01 
0 
0 
*** 
D 
dIFG 
< 
IFG 
0 
0 
0 
0.001 
0.94 
I 
dMidSTG 
< 
dIFG 
0.2 
0.05 
0.011 
0.015 
*** 
C 
dPreM 
< 
dIFG 
0.14 
0.04 
0.009 
0.165 
*** 
C 
dPostSTG 
< 
dMidSTG 
0.48 
0.72 
0.002 
0.024 
*** 
C 
dPreM 
< 
dMidSTG 
0.32 
0.14 
0.003 
0.04 
*** 
C 
dM1 
< 
dMidSTG 
0.41 
0.77 
0 
0.001 
*** 
C 
dM1 
< 
dPostSTG 
0.47 
0.6 
0.002 
0.037 
*** 
C 
dM1 
< 
dPreM 
0.22 
0.43 
0 
0.002 
*** 
C 
dPreM 
< 
dPostSTG 
0.07 
0.04 
0.003 
0.042 
0.01 
C 
dM1 
< 
dIFG 
0.08 
0.04 
0.001 
0.015 
*** 
C 
PreM 
< 
dMidSTG 
0.32 
12.25 
0.068 
1.045 
*** 
D 
PostSTG 
< 
MidSTG 
0.61 
0.93 
0.002 
0.034 
*** 
C 
dPostSTG 
< 
PreM 
0.1 
0 
0 
0.001 
*** 
D 
M1 
< 
PostSTG 
0.63 
0.71 
0.003 
0.033 
*** 
C 
PostSTG 
< 
dMidSTG 
0.05 
4.66 
0.081 
1.24 
*** 
D 
M1 
< 
dIFG 
0.1 
1.53 
0.023 
0.357 
*** 
D 
M1 
< 
dPostSTG 
0.01 
0.43 
0.082 
1.02 
0.66 
D 
MidSTG 
< 
dIFG 
0.04 
0.41 
0.015 
0.301 
0.18 
D 
M1 
< 
dMidSTG 
0.18 
10.85 
0.102 
1.484 
*** 
D 
PreM 
< 
MidSTG 
0.55 
0.16 
0.003 
0.038 
*** 
C 
M1 
< 
dPreM 
0.15 
9.43 
0.125 
1.661 
*** 
D 
PostSTG 
< 
dIFG 
0.16 
2.61 
0.023 
0.311 
*** 
D 
dPreM 
< 
MidSTG 
0.61 
0.02 
0 
0.001 
*** 
D 
dPostSTG 
< 
dIFG 
0.32 
0.12 
0.01 
0.117 
*** 
C 
PreM 
< 
dIFG 
0.26 
2.43 
0.021 
0.297 
*** 
D 
dPreM 
< 
PostSTG 
0.2 
0 
0 
0.001 
*** 
D 
dM1 
< 
PreM 
0.05 
0 
0 
0.002 
0.1 
D 
M1 
< 
PreM 
0.5 
0.73 
0.003 
0.035 
*** 
C 
M1 
< 
MidSTG 
0.55 
0.92 
0.004 
0.055 
*** 
C 
dM1 
< 
MidSTG 
0.05 
0 
0 
0.002 
0.21 
D 
dM1 
< 
PostSTG 
0.1 
0 
0 
0.001 
0.01 
D 
PreM 
< 
PostSTG 
0.2 
0.12 
0.001 
0.026 
*** 
C 
dPostSTG 
< 
PostSTG 
0.08 
0 
0 
0.001 
0 
I 
dMidSTG 
< 
MidSTG 
0.01 
0 
0 
0.001 
0.76 
I 
dM1 
< 
M1 
0.15 
0.01 
0 
0.001 
*** 
I 
dPreM 
< 
PreM 
0.11 
0 
0 
0.001 
*** 
I 
dIFG 
< 
dM1 
0.01 
0.02 
0.006 
0.09 
0.81 
C 
dPostSTG 
< 
dPreM 
0.03 
0.05 
0.003 
0.042 
0.28 
C 
dMidSTG 
< 
dPostSTG 
0 
0.01 
0.002 
0.024 
0.84 
C 
dIFG 
< 
dMidSTG 
0.01 
0.04 
0.011 
0.151 
0.78 
C 
dMidSTG 
< 
dPreM 
0 
0 
0.003 
0.04 
0.98 
C 
dlFG 
< 
dPreM 
0 
0 
0.009 
0.165 
0.98 
C 
dPostSTG 
< 
dM1 
0.03 
0 
0.004 
0.155 
0.95 
C 
dMidSTG 
< 
dM1 
0.01 
0 
0 
0.03 
0.96 
C 
dPreM 
< 
dM1 
0.43 
0.1 
0.005 
0.046 
*** 
C 
dIFG 
< 
dPostSTG 
0 
0 
0.01 
0.117 
0.99 
C 
Table 1: Bayesian SEM Estimates PlaybackListen Condition – Right Hemisphere (Subject 186).
Note: *** denotes significant at p<.001. C =contemporaneous; D = delayed; I = intrinsic. Note. Path coefficients are mean standardized regression weights (point estimates) based on marginal posterior distribution resulting from Bayesian analysis with MCMC resampling. Parameter estimates (regression weights) are modeled as random parameters. SD = posterior standard deviation of the distribution. S.E. = posterior standard error of the distribution. 95% Bayesian credible interval runs from the 2.5 percentile to the 97.5 percentile. Credible intervals do not depend on a normal (Gaussian) distribution to establish confidence limits. PPP = .36, DIC = 134.09
Regression Paths 
Standardized Estimate 
Unstandardized Estimate 
S.E. 
SD 
p 
Relationship Type 
dIFG 
< 
IFG 
0.17 
0.01 
0 
0.002 
*** 
I 
MidSTG 
< 
IFG 
0.37 
0.69 
0.003 
0.054 
*** 
C 
PostSTG 
< 
IFG 
0.35 
0.63 
0.003 
0.046 
*** 
D 
dM1 
< 
IFG 
0.57 
0.02 
0 
0.001 
*** 
D 
dPostSTG 
< 
IFG 
0.5 
0.02 
0 
0.002 
*** 
D 
M1 
< 
IFG 
0.4 
0.47 
0.002 
0.039 
*** 
D 
dMidSTG 
< 
IFG 
0.01 
0 
0 
0.002 
0.76 
D 
PreM 
< 
IFG 
0.06 
0.05 
0.002 
0.029 
0.06 
C 
dPreM 
< 
IFG 
0.35 
0.01 
0 
0.001 
*** 
D 
dMidSTG 
< 
dIFG 
0.05 
0.04 
0 
0.001 
0.1 
C 
dPreM 
< 
dIFG 
0.28 
0.2 
0.051 
0.653 
*** 
C 
dPostSTG 
< 
dMidSTG 
0.03 
0.04 
0.015 
0.013 
*** 
C 
dPreM 
< 
dMidSTG 
0.04 
0.04 
0.004 
0.064 
0.1 
C 
dM1 
< 
dMidSTG 
0 
0 
0.001 
0.031 
0.99 
C 
dM1 
< 
dPostSTG 
0.08 
0.05 
0 
0.001 
0.01 
C 
dM1 
< 
dPreM 
0.33 
0.28 
0 
0.001 
*** 
C 
dPreM 
< 
dPostSTG 
0.35 
0.29 
0 
0.001 
*** 
C 
dM1 
< 
dIFG 
0.46 
0.28 
0.027 
0.59 
*** 
C 
dMidSTG 
< 
MidSTG 
0.08 
0 
0 
0.001 
0.02 
I 
dPostSTG 
< 
PostSTG 
0.39 
0.01 
0 
0.001 
*** 
I 
dPreM 
< 
PreM 
0.04 
0 
0 
0.001 
0.14 
I 
dM1 
< 
M1 
0.04 
0 
0 
0.001 
0.23 
I 
PreM 
< 
dMidSTG 
0.67 
17.27 
0.278 
2.052 
*** 
D 
PostSTG 
< 
MidSTG 
0.49 
0.47 
0.001 
0.024 
*** 
C 
dPostSTG 
< 
PreM 
0.62 
0.03 
0.001 
0.004 
*** 
D 
M1 
< 
PostSTG 
0.57 
0.37 
0.001 
0.027 
*** 
C 
PostSTG 
< 
dMidSTG 
0.15 
7.52 
0.108 
1.794 
*** 
D 
M1 
< 
IFG 
0.17 
4.07 
0.002 
0.039 
*** 
C 
M1 
< 
dPostSTG 
0.1 
2.72 
0.049 
0.859 
0 
D 
MidSTG 
< 
IFG 
0.19 
7.05 
0.003 
0.054 
*** 
C 
M1 
< 
dMidSTG 
0 
0.12 
0.059 
0.948 
0.91 
D 
PreM 
< 
MidSTG 
0.28 
0.14 
0.001 
0.014 
*** 
C 
M1 
< 
dPreM 
0.09 
3.08 
0.067 
1.136 
0.01 
D 
PostSTG 
< 
dIFG 
0.1 
3.43 
0 
0.002 
*** 
D 
dPreM 
< 
MidSTG 
0.48 
0.01 
0 
0.001 
*** 
D 
dPostSTG 
< 
dIFG 
0.16 
0.14 
0.003 
0.041 
*** 
C 
PreM 
< 
dIFG 
0.08 
1.47 
0.051 
0.653 
0.01 
D 
dPreM 
< 
PostSTG 
0.62 
0.01 
0 
0.001 
*** 
D 
dM1 
< 
PreM 
0.31 
0.01 
0 
0.001 
*** 
D 
M1 
< 
PreM 
0.36 
0.47 
0.002 
0.04 
*** 
C 
M1 
< 
MidSTG 
0.21 
0.13 
0.001 
0.023 
*** 
C 
dM1 
< 
MidSTG 
0.01 
0 
0 
0.001 
0.74 
D 
dM1 
< 
PostSTG 
0.4 
0.01 
0 
0.001 
*** 
D 
dIFG 
< 
dMidSTG 
0.05 
0.07 
0 
0.002 
0.18 
D 
dMidSTG 
< 
dPostSTG 
0.56 
0.45 
0.015 
0.103 
*** 
C 
dPostSTG 
< 
dPreM 
0.04 
0.05 
0.003 
0.053 
0.25 
C 
dPreM 
< 
dM1 
0 
0 
0.001 
0.031 
1 
C 
dIFG 
< 
dM1 
0 
0.01 
0.004 
0.064 
0.88 
C 
dMidSTG 
< 
dPreM 
0.01 
0.01 
0.004 
0.064 
0.81 
C 
dPostSTG 
< 
dM1 
0 
0.01 
0.004 
0.054 
0.83 
C 
dMidSTG 
< 
dM1 
0 
0 
0.007 
0.068 
0.87 
C 
dIFG 
< 
dPreM 
0.03 
0.03 
0.005 
0.074 
0.57 
C 
Table 2: Bayesian SEM Estimates Vocalization Condition – Right Hemisphere (Subject 186).
Note: *** denotes significant at p<.001. C =contemporaneous; D = delayed; I = intrinsic. Note. Path coefficients are mean standardized regression weights (point estimates) based on marginal posterior distribution resulting from Bayesian analysis with MCMC resampling. Parameter estimates (regression weights) are modeled as random parameters. SD = posterior standard deviation of the distribution. S.E. = posterior standard error of the distribution. 95% Bayesian credible interval runs from the 2.5 percentile to the 97.5 percentile. Credible intervals do not depend on a normal (Gaussian) distribution to establish confidence limits. PPP = .42, DIC = 131.05.
N = 1 
N = 2 
Path 
Population 
Average 
%Bias 
95% Coverage 
Power 
Population 
Average 
%Bias 

95% Coverage 
Power 
IFG on 











dM1 
61.43 
43.05 
0.29 
0.65 
0.96 
61.43 
45.729 
0.256 
0.676 

0.999 
dIFG 
26.72 
26.14 
0.022 
0.91 
0.99 
26.715 
26.704 
0 
0.933 

1 
MidSTG 
1.99 
1.71 
0.141 
0.63 
0.98 
1.99 
1.71 
0.141 
0.63 

0.98 
dMidSTG 
2.04 
1.241 
0.392 
0.77 
0.15 
2.04 
0.783 
0.616 
0.7 

0.238 
MidSTG on 











IFG 
1.99 
1.71 
0.141 
0.63 
0.98 
1.99 
1.726 
0.133 
0.66 

1 
PostSTG on 











IFG 
0.735 
0.416 
1.566 
0.65 
0.85 
0.735 
0.422 
1.574 
0.683 

0.976 
MidSTG 
0.55 
0.51 
0.073 
0.82 
1 
0.546 
0.518 
0.051 
0.744 

1 
dMidSTG 
7.79 
9.96 
0.279 
0.84 
0.99 
7.79 
9.924 
0.274 
0.736 

1 
dIFG 
7.85 
10.93 
0.392 
0.67 
0.99 
7.847 
10.751 
0.37 
0.677 

1 
dM1 on 











IFG 
0.004 
0.007 
0.75 
0.69 
0.33 
0.004 
0.007 
0.625 
0.68 

0.332 
dMidSTG 
0.006 
0.033 
4.5 
0.864 
0.152 
0 
0 
0 
1 

0 
dPostSTG 
0.089 
0.098 
0.101 
0.93 
0.43 
0.089 
0.098 
0.101 
0.93 

0.43 
dPreM 
0.28 
0.28 
0 
0.99 
0.99 
0.28 
0.28 
0 
1 

0 
dIFG 
0.083 
0.078 
0.06 
0.9 
0.65 
0.28 
0.28 
0 
1 

0 
M1 
0.008 
0.006 
0.25 
0.656 
0.247 
0.008 
0.006 
0.25 
0.678 

0.498 
PreM 
0.054 
0.038 
0.296 
0.63 
0.99 
0.054 
0.0405 
0.25 
0.668 

1 
MidSTG 
0.019 
0.014 
0.279 
0.63 
0.915 
0.019 
0.014 
0.279 
0.669 

0.998 
PostSTG 
0.019 
0.014 
0.289 
0.64 
0.935 
0.019 
0.014 
0.247 
0.675 

0.996 
dPostSTG on 











IFG 
0.025 
0.0254 
0.016 
0.94 
1 
0.025 
0.025 
0 
0.938 

1 
dMidSTG 
0.037 
0.037 
0 
1 
0 
0.037 
0.037 
0 
1 

0 
PostSTG 
0.008 
0.009 
0.125 
0.84 
1 
0.008 
0.009 
0.125 
0.791 

1 
PreM 
0.028 
0.029 
0.032 
0.92 
0.99 
0.028 
0.029 
0.032 
0.893 

1 
dIFG 
0.083 
0.078 
0.06 
0.9 
0.65 
0.083 
0.074 
0.114 
0.861 

0.859 
dPreM 
0.098 
0.152 
0.551 
0.67 
0.69 
0.098 
0.152 
0.551 
0.671 

0.86 
dM1 
0.089 
0.089 
0 
0.93 
0.43 
0.089 
0.089 
0.002 
0.951 

0.636 
M1 on 











IFG 
0.576 
0.57 
0.01 
0.95 
1 
0.576 
0.57 
0.01 
0.937 

1 
PostSTG 
0.415 
0.422 
0.017 
0.917 
1 
0.415 
0.4214 
0.015 
0.906 

1 
dPostSTG 
2.929 
2.925 
0.001 
0.945 
0.89 
2.929 
2.914 
0.005 
0.945 

0.998 
dMidSTG 
0.207 
0.23 
0.111 
0.946 
0.06 
0.207 
0.189 
0.086 
0.945 

0.061 
dPreM 
0.543 
0.575 
0.059 
0.955 
0.085 
0.543 
0.529 
0.027 
0.953 

0.116 
PreM 
0.437 
0.459 
0.05 
0.848 
1 
0.437 
0.458 
0.048 
0.78 

1 
MidSTG 
0.051 
0.059 
0.157 
0.896 
0.567 
0.051 
0.0577 
0.131 
0.863 

0.721 
dMidSTG on 











IFG 
0.006 
0.005 
0.167 
0.943 
0.641 
0.006 
0.005 
0.167 
0.949 

0.901 
dIFG 
0.035 
0.035 
0 
0.99 
1 
0.035 
0.035 
0 
0.99 

1 
MidSTG 
0.003 
0.002 
0.333 
0.641 
0.388 
0.003 
0.002 
0.333 
0.669 

0.67 
dPostSTG 
0.431 
0.466 
0.081 
0.855 
0.997 
0.431 
0.43 
0.002 
0.829 

1 
dPreM 
0.01 
0.032 
2.2 
0.918 
0.083 
0.01 
0.022 
1.2 
0.925 

0.099 
dM1 
0.006 
0.033 
4.5 
0.864 
0.152 
0.006 
0.022 
2.667 
0.828 

0.199 
PreM on 











IFG 
0.267 
0.17 
0.363 
0.666 
0.633 
0.267 
0.185 
0.306 
0.681 

0.708 
dMidSTG 
20.397 
19.428 
0.048 
0.632 
1 
20.397 
18.969 
0.07 
0.669 

1 
MidSTG 
0.013 
0.033 
3.538 
0.633 
0.35 
0.013 
0.027 
3.077 
0.67 

0.331 
dIFG 
4.215 
4.018 
0.047 
0.856 
0.997 
4.215 
3.986 
0.054 
0.791 

1 
dPreM on 











IFG 
0.018 
0.023 
0.278 
0.659 
1 
0.018 
0.0237 
0.317 
0.681 

0.708 
dIFG 
0.035 
0.035 
0 
1 
0 
0.19 
0.19 
0 
1 

0 
dMidSTG 
0.04 
0.04 
0 
1 
0 
0.04 
0.04 
0 
1 

0 
dPostSTG 
0.29 
0.29 
0 
1 
0 
0.29 
0.29 
0 
1 

0 
PreM 
0 
0.002 
0 
0.637 
0.363 
0 
0.002 
0 
0.676 

0.324 
MidSTG 
0.009 
0.017 
0.922 
0.658 
1 
0.009 
0.011 
0.189 
0.674 

1 
PostSTG 
0.013 
0.0173 
0.331 
0.658 
1 
0.013 
0.0173 
0.331 
0.681 

1 
dM1 
0.28 
0.28 
0 
0.99 
0.99 
0.28 
0.28 
0 
1 

0 
dIFG on 











dM1 
0.567 
0.612 
0.079 
0.939 
0.896 
0.567 
0.609 
0.074 
0.946 

0.995 
dPreM 
0.43 
1.23 
1.86 
0.654 
0.997 
0.43 
1.222 
1.842 
0.68 

1 
Table 3: Monte Carlo Simulation – Vocalization Condition (N=1 and N=2).
Note 1: Results based on 200000 MCMC samples for 1110 discrete time points. ChiSq = 24.87 (4); RMSEA = .025; SRMR = 0.008/SD = 0.007.
Note 2: Results based on 200000 MCMC samples using 2220 discrete time points. ChiSq = 17.12 (4); RMSEA = .025; SRMR = 0.008/SD = 0.007.
N = 3 
N = 5 
Path 
Population 
Average 
%Bias 
95% Coverage 
Power 
Population 
Average 
%Bias 
95% Coverage 
Power 
dM1 
61.43 
45.285 
0.263 
0.673 
1 
61.429 
47.049 
0.234 
0.708 
1 
dIFG 
26.715 
26.734 
0.001 
0.943 
1 
26.715 
26.785 
0.003 
0.943 
1 
MidSTG 
0 
0 
0 
1 
0 
1.99 
0 
1 
1 
0 
dMidSTG 
2.04 
0.702 
0.656 
0.677 
0.323 
2.04 
0.785 
0.615 
0.709 
0.462 
MidSTG on 










IFG 
1.99 
1.7115 
0.14 
0.657 
0.999 
1.99 
1.729 
0.131 
0.704 
1 
PostSTG on 










IFG 
0.735 
0.4095 
0.443 
0.676 
0.997 
0.735 
0.446 
0.393 
0.701 
0.999 
MidSTG 
0.546 
0.519 
0.05 
0.723 
1 
0.546 
0.521 
0.046 
0.713 
1 
dMidSTG 
7.79 
10.024 
0.287 
0.682 
1 
7.79 
9.811 
0.259 
0.705 
1 
dIFG 
7.847 
10.794 
0.376 
0.66 
1 
7.847 
10.418 
0.328 
0.694 
1 
dM1 on 










IFG 
0.004 
0.006 
0.6 
0.675 
0.368 
0.004 
0.006 
0.525 
0.7 
0.398 
dMidSTG 
0 
0 
0 
1 
0 
0 
0 
0 
1 
0 
dPostSTG 
0.089 
0.098 
0.101 
0.93 
0.43 
0.089 
0.098 
0.101 
0.93 
0.43 
dPreM 
0.28 
0.28 
0 
1 
0 
0.28 
0.28 
0 
1 
0 
dIFG 
0.28 
0.28 
0 
1 
0 
0.28 
0.28 
0 
1 
0 
M1 
0.008 
0.006 
0.25 
0.658 
0.682 
0.008 
0.006 
0.25 
0.703 
0.899 
PreM 
0.054 
0.0401 
0.257 
0.667 
1 
0.054 
0.0417 
0.228 
0.702 
1 
MidSTG 
0.019 
0.014 
0.253 
0.67 
1 
0.019 
0.015 
0.226 
0.698 
1 
PostSTG 
0.019 
0.014 
0.253 
0.661 
1 
0.019 
0.015 
0.221 
0.7 
1 
dPostSTG on 










IFG 
0.025 
0.025 
0 
0.931 
1 
0.025 
0.0249 
0.004 
0.954 
1 
dMidSTG 
0.037 
0.037 
0 
1 
0 
0.037 
0.037 
0 
1 
0 
PostSTG 
0.008 
0.009 
0.125 
0.71 
1 
0.008 
0.0084 
0.05 
0.729 
1 
PreM 
0.028 
0.028 
0.004 
0.876 
1 
0.028 
0.028 
0.014 
0.854 
1 
dIFG 
0.083 
0.073 
0.117 
0.825 
0.943 
0.083 
0.074 
0.106 
0.782 
0.993 
dPreM 
0.098 
0.155 
0.579 
0.667 
0.955 
0.098 
0.15 
0.533 
0.709 
0.999 
dM1 
0.089 
0.091 
0.019 
0.947 
0.779 
0.089 
0.09 
0.016 
0.938 
0.921 
M1 on 










IFG 
0.576 
0.5707 
0.009 
0.93 
1 
0.576 
0.5715 
0.008 
0.918 
1 
PostSTG 
0.415 
0.4219 
0.017 
0.863 
1 
0.415 
0.4208 
0.014 
0.854 
1 
dPostSTG 
2.929 
2.955 
0.009 
0.936 
0.998 
2.929 
2.912 
0.006 
0.943 
1 
dMidSTG 
0.207 
0.21 
0.013 
0.95 
0.063 
0.207 
0.195 
0.059 
0.937 
0.081 
dPreM 
0.543 
0.546 
0.006 
0.949 
0.166 
0.543 
0.55 
0.013 
0.934 
0.234 
PreM 
0.437 
0.459 
0.051 
0.719 
1 
0.437 
0.455 
0.041 
0.709 
1 
MidSTG 
0.051 
0.0582 
0.141 
0.817 
0.835 
0.051 
0.0574 
0.125 
0.765 
0.943 
dMidSTG on 










IFG 
0.006 
0.0056 
0.067 
0.948 
1 
0.006 
0.0058 
0.033 
0.946 
1 
dIFG 
0.035 
0.035 
0 
1 
0 
0.035 
0.035 
0 
1 
0 
MidSTG 
0.003 
0.002 
0.367 
0.67 
0.845 
0.003 
0.002 
0.333 
0.7 
0.939 
dPostSTG 
0.431 
0.426 
0.011 
0.76 
1 
0.431 
0.432 
0.002 
0.747 
1 
dPreM 
0.01 
0.0293 
1.93 
0.9 
0.139 
0.01 
0.0226 
1.26 
0.859 
0.189 
dM1 
0.006 
0.026 
3.3 
0.787 
0.243 
0.006 
0.022 
2.667 
0.743 
0.279 
PreM on 










IFG 
0.267 
0.184 
0.31 
0.676 
0.717 
0.267 
0.194 
0.272 
0.705 
0.747 
dMidSTG 
20.397 
18.8361 
0.077 
0.661 
1 
20.397 
18.856 
0.076 
0.707 
1 
MidSTG 
0.013 
0.029 
3.238 
0.66 
0.333 
0.013 
0.024 
2.877 
0.698 
0.336 
dIFG 
4.215 
3.9623 
0.06 
0.722 
1 
4.215 
3.959 
0.061 
0.714 
1 
dPreM on 










IFG 
0.018 
0.0239 
0.328 
0.656 
1 
0.018 
0.0233 
0.294 
0.698 
0.708 
dIFG 
0.19 
0.19 
0 
1 
0 
0.19 
0.19 
0 
1 
0 
dMidSTG 
0.04 
0.04 
0 
1 
0 
0.04 
0.04 
0 
1 
0 
dPostSTG 
0.29 
0.29 
0 
1 
0 
0.29 
0.29 
0 
1 
0 
PreM 
0 
0.002 
0 
0.67 
0.333 
0 
0.002 
0 
0.698 
0.302 
MidSTG 
0.009 
0.011 
0.2 
0.68 
1 
0.009 
0.011 
0.178 
0.713 
1 
PostSTG 
0.013 
0.0174 
0.338 
0.673 
1 
0.013 
0.0169 
0.3 
0.71 
1 
dM1 
0.28 
0.28 
0 
1 
0 
0.28 
0.28 
0 
1 
0 
dIFG on 










dM1 
0.567 
0.61 
0.076 
0.929 
0.999 
0.567 
0.603 
0.063 
0.898 
1 
dPreM 
0.43 
1.259 
1.928 
0.678 
1 
0.43 
1.173 
1.728 
0.705 
1 
Table 4: Monte Carlo Simulation – Vocalization Condition (N=1 and N=2).
Note 1: Results based on 1000 MCMC samples for 3330 discrete time points. ChiSq = 24.87 (4); RMSEA = .025; SRMR = 0.008/SD = 0.007.
Note 2: Results based on 5550 MCMC samples using 1110 discrete time points. ChiSq = 34.19 (4); RMSEA = .022; SRMR = 0.007/SD = 0.007.
Results
Emergent patterns in connectivity – right hemisphere:A primary goal of the method illustrated here was to present a methodological approach for identifying emergent patterns (i.e. strength and connectivity) in effective network connectivity using ECoG data under two study conditions. Our analytic strategy provides a way to capture intrinsic, delayed and contemporaneous relations in a single, unified model. A secondary goal included evaluating the tenability of pooling covariance matrices across subjects to conduct grouplevel analyses of ECoG data to model effective connectivity. Pooling subject covariance matrices was determined as untenable and we proceeded by modeling intra individual dynamics of each subject. Examination of Figure 1 reveals unique paths and associated strengths by playbacklisten and vocalization conditions. For example, the final models by study condition were not conclusively the same, the sign and standardized path loadings were inconsistent or reversed in sign or both. The exception to this pattern of findings is in the case of some crosscovariances (i.e. lagged relationships with an ROI on itself or a lagged relationship with another ROI’s lagged relationship). However, the observed pattern(s) of lagged effects was expected and provides no further insight relative to the goals of our study.
Discussion
Modeling complex relationships of neural systems based on functional and effective connectivity poses unique challenges to neuroscientists. In this paper we presented an approach to modeling effective and intrinsic relationships between ROIs simultaneously within a functional network. A primary goal of our work included systematically addressing issues specific to effective and intrinsic connectivity in a way that yielded an accurate map of vocal sensorimotor integration within and between ROIs. This is particularly important in dealing with electrophysiological data from neurosurgical patients because of the exquisite anatomical localization and temporal resolution.
The modeling approach presented here is aimed toward increasing an understanding of the causal relationships observed among electrodes during different conditions that implicate various stages of sensorimotor vocal control. Our approach is geared to the development of network models that can explain the neural control of human vocalization. Specifically, our work is focused on understanding feed forward (FF) and feedback (FB) control mechanisms using a paradigm by which we alter auditory feedback to subjects in real time while they are vocalizing. Such data inform specific processes such as selfvoice identification and FB based error detection.
Our work proceeded by first demonstrating an approach for estimating delayed data structures relative to each ROI using differential equations. Second, realtime and delayed representations were input for each region into a baseline or null model that assumed no relationships. Next, an heuristic iterative search algorithm was used to iteratively specify directional paths between regions. Our method was demonstrated on 5 regions of the voice network using ECoG data. The approach presented here offers a systematic and comprehensive way to model effective and intrinsic relationships and appears to work well for electrocorticography (ECoG), a form of electrophysiological data collected from electrodes placed on the surface of patents' cortex during neurosurgery.
While this is not an applications paper, it is useful to determine the nature of the relations among the ROIs modeled here. In this work we used data for which subjects vocalized an “ah” at a comfortable loudness and pitch feel or listened to their own vocalizations (passive listening). Based on our previous work with fMRI, ERP and ECoG we selected regions that are implicated in sensory control of the human voice. What is novelabout the use of ECoG data is that we were able to divide STG into multiple regions in order to determine functional connectivity associated with the percolation of this region, a task that is highly challenging with fMRI or scalp based ERP data. Since we know that the STG plays a central role in the sensory motor control of the voice, this region likely serves as the hub for modulation of other brain regions [51,52]. Given this, by understanding how different regions within the area relate to one another and to the rest of the brain will shed light on this complex process. That the middle and posterior STG show differential connectivity patterns allows for the generation of neural control hypotheses that can drive our future work in a large group of subjects and help provide mechanistic information about vocalization in healthy and disease. For example, we know that patients with Parkinson’s disease have hypophonia in which they vocalize with very low loudness levels. Importantly, we know that they under scale motor output such that they report that they are speaking at normal or greater than normal loudness levels. These patients have abnormal responses to altered auditory feedback and this may be associated with abnormal STG function within the broader network of brain regions controlling vocalization.
Figure 1: Listen and Vocalization Conditions – Right Hemisphere (Subject 186).
In our fMRI and evoked response potential (ERP) studies we have shown key connectivity differences associated with alterations in auditory feedback during vocalization. Our dynamic causal modeling of ERP signals has shown important coupling properties associated with bilateral STG, inferior frontal and premotor cortices. We have extended this observations to study musicians with absolute pitch, those with relative pitch and nonmusicians showing that connectivity variations in these regions differentiated the three subject groups. While these findings are important, the ability to use ECoG data and capitalize on its temporal and spatial resolution will allow for far greater insight into the role of different regions of STG in vocal control.
Limitations:The method introduced here includes a novel and powerful approach to modeling ECoG data. However, we acknowledge several limitations that warrant investigation in future studies. First, in the present study the selection of electrodes was anatomically driven. In a future study we plan to investigate the utility of our modeling approach using a physiological responsebased electrode location strategy. Second, a larger sample of subjects will allow us to conduct a more comprehensive evaluation of our modeling framework in measuring neuronal dynamics focusing on intraindividual (singlesubject) versus inter individual (grouplevel). Third, in a related manner, using a larger sample of subjects exhibiting language dominant versus nondominant network differences, we will be able to evaluate the utility of our modeling approach in accurately capturing brain dynamics.
References
 Friston KJ, 2007 Functional Integration Ch36 in K Friston, Ashburner SJ, Kiebel TE, Nichols WD, Penny (Eds.). Statistical Parametric Mapping:The Analysis of Functional Brain Images. UK: Academic Press, London, UK.
 Zhang T, Wu J, Li F, Caffo B, Boatman RD (2015) A dynamic directional model for effective brain connectivity using electrocorticographic (ECoG) time series. Journal of the American Statistical Association 110(509): 93106.
 Freestone DR, Karoly PJ, Dragan N, Aram P, Cook MJ, et al. (2014) Estimation of effective connectivity via datadriven neural modeling. Frontiers in Neuroscience 8: 383.
 Bénar CG, Grova C, Kobayashi E, Bagshaw AP, Aghakhani Y, et al. (2006) EEGfMRI of epileptic spikes: Concordance with EEG source localization and intracranial EEG. NeuroImage 30(4): 11611170.
 Greenlee JDW, Jackson AW, Chen F, Larson CR, Oya H, et al. (2011) Human Auditory Cortical Activation during SelfVocalization. PLoS One 6(3): e14744.
 Greenlee JDW, Behroozmand R, Larson CR, Jackson AW, Chen F, et al. (2013) SensoryMotor Interactions for Vocal Pitch Monitoring in NonPrimary Human Auditory Cortex. PLoS One 8(4): e60783.
 McIntosh AR, Gonzales Lima F (1994) Structural Equation Modeling and its Application to Network Analysis in Functional Brain Imaging. Human Brain Mapping 2(12): 222.
 Price LR (2013) Analysis of Imaging Data. Chapter 9.In: TD Little (Ed). Oxford Handbook of Quantitative Methods (Vol .2) Oxford University Press, UK, pp: 175197.
 Friston KJ (1994) Functional and Effective Connectivity in Neuroimaging: A Synthesis. Human Brain Mapping 2: 5678.
 Jöreskog K, Sörbom D (1979) Advances in Factor Analysis and Structural Equation Models. Abt Books, Massachusetts, USA.
 Bentler IM, Weeks DG (1980) Linear Structural Equations with Latent Variables. Psychometrika 45(3): 289308.
 Lee, SY (2007) Structural Equation Modeling: A Bayesian Approach. NY: Wiley Series in Probability & Statistics New York, USA.
 Price LR (2012) Small sample properties of Bayesian multivariate autoregressive time series models. Structural EquationModeling 19(1): 5164.
 Price LR, Peter TF, Roger JI, Angela RL (2009) Modeling dynamic functional neuroimaging data using structural equation modeling. Structural EquationModeling 16(1): 147162.
 Pearl J (2000) Causality: Models, reasoning and inference. NY: Cambridge University Press, New York, USA.
 Dayan P, Abbott LF (2001) Theoretical neuroscience. Computational and mathematical modeling of neural systems. MIT Press 36: 3.
 Dayan P, Hinton GE, Neal NM, Zemel RS (1995) The Helmholtz machine. Neural Computation 7(5): 889904.
 Lee SY, Song XY (2004) Evaluation of Bayesian and maximum likelihood approaches in analyzing structural equation models with small sample sizes. Multivariate Behavioral Research 39(4): 653686.
 Box GEP, Jenkins GM, Reinsel GC (2008) Time series analysis: Forecasting and control (4th edn) 784.
 Lütkepohl H (2006) New introduction to multiple time series analysis. NY: Springer, New York, USA.
 West M, Harrison, J (1997) Bayesian forecasting and dynamic control.(2nd ed.), NY: SpringerVerlag, New York, USA.
 Soares TM, Gonclaves FB, Gamerman D (2009) An integrated Bayesian model for DIF analysis. Journal of Educational andBehavioral Statistics 34(3): 348377.
 Dunson DB (2000) Bayesian latent variable models for clustered mixed outcomes. Journal of the Royal Statistical SocietySeriesB 62(2): 355366.
 Scheines R, Hoijtink H, Boomsma A (1999) Bayesian estimation and testing of structural equation models. Psychometrika64(1): 3752.
 Cohen J, Cohen P, West SG, Aiken LS (2003) Applied multiple regression/correlation analysis for the behavioral sciences. Journal of Educational and Behavioral Statistics 30(2): 227229.
 Kim J, Zhu W, Chang L, Bentler PM, Ernst T ( 2007) Unified Structural Equation Modeling Approach for the Analysis of Multisubject, Multivariate Functional MRI data. Human Brain Mapping 28(2): 8593.
 Gates KM, Molenaar PC, Hillary FG, Slobounov S (2011) Extended unified SEM approach for modeling eventrelated fMRI data. Neuroimage 54(2): 11511158.
 Ramsey JO, Hooker G, Campbell D, Cao J (2007) Parameter estimation for differential equations: a generalized smoothing approach. Journal of the Royal Statistical Society SeriesB 69(5): 741796.
 Fleiss M, Lamnabhi M, LamnabhiLagarrigue, F (1983) An algebraic approach to nonlinear functional expansions. IEEE Trans Circuits Syst 30(8): 554579.
 Ramsay JO, Silverman BW (2005) Applied functional data analysis. Springer Series in Statistics, New York, USA.
 Friston KJ, Harrison L, Penny W (2003) Dynamic causal modelling. NeuroImage 19(4): 12731302.
 Molenaar PCM, Sinclair KO, Rovine MJ, Ram N, Corneal SE (2009) Analyzing developmental processes on an individual level using nonstationary time series modeling. Developmental Psychology 45(1): 260271.
 Nesselroade JR, Molenaar PCM (1999) Pooling Lagged Covariance Structures Based on Short, Multivariate Time Series for Dynamic Factor Analysis. Chapter 9 In: Hoyle (Ed). Statistical Strategies for Small Sample Research. Thousand Oaks, CA: Sage Publications, Newbury Park, California. pp. 223251.
 An HZ, Chen SG (1997) A note on the ergodicity of nonlinear autoregressive models. Statistics & Probability Letters 34(4): 365372.
 Salhi S (2009) Heuristic Search Methods. In GA Marcoulides (Ed). Modern Methods for Business Research, Quantitative Methods Series. Lawrence Erlbaum Associates, New Jersey, USA. pp. 147176.
 Hastie T, Tibshirani R, Freidman (2009) The elements of statistical learning: Data mining, inference and prediction( 2nd edn). Springer Series in Statistics. Springer Nature, New York, USA. pp. 1745.
 Burnham KP, Anderson DR (2002) Model selection and multimodal inference: A practical informationtheoretic approach (2nd edn) Statistical Theory and Methods. Springer Nature, New York, USA. pp. 1485.
 Kullback S, Leibler RA (1951) On information and sufficiency. Annals of Mathematical Statistics 22(1): 7986.
 Raftery AE (1993) Bayesian model selection in structural equation models. In Testing Structural Equation Models, KA Bollen JS Long (Ed) CA: Sage Publications, Newbury Park, California. pp. 163180.
 Browne MW, Cudeck R (1989) Singlesample crossvalidation indices for covariance structures. Multivariate Behavioral Research 24(4): 445455.
 Gelman A, Carlin JB, Stern HS, Rubin DB (2014) Bayesian data analysis(2nd ed.) Boca Raton FL: Chapman & Hall, USA.
 Steiger JH (1990) Structural model evaluation and modification: An interval estimation approach. Multivariate Behav Res 25(2): 173180.
 Hoff, PD (2009) A first course in Bayesian statistical methods. New York, NY: Springer, USA.
 Sevinç V, Ergün G (2009) Usage of different prior distributions in Bayesian vector autoregressive models. Hacettepe Journal of Mathematics and Statistics 38(1): 8593.
 Gelman A, Jakulin A, Pittau MG, Su YS (2008) A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics 2(4): 13601383.
 Anderson TW (2003) An introduction to multivariate statistical analysis (3rd edn) Wiley, New York, USA.
 http://www.statmodel.com.
 Muthén BO, Muthén L (2012) Mplus computer program version 7.3. Mplus, Los Angeles, California. pp. 1711.
 Brooks S, Gelman A, Jones G, Meng X (2011) Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC. Taylor & Francis, Boca Raton, Florida. pp. 1575.
 Cudeck R, Kleve KJ, Henly S.J (1993) A Simple GaussNewton Procedure for Covariance Structure Analysis with HighLevel Computer Languages. Psychometrika 58(2): 211232.
 Laird AR, Robbins JM, Li K, Price LR, Cykowski MD, et al. (2008) Modeling Motor Connectivity using TMS/PET and Structural Equation Modeling. Neuroimage41(2): 424436.
 Neuropsychologia (2013) 51(8): 14711480.

