A Mixed-Integer Linear Programming Formulation for Optimizing Multi-Scale Material and Energy Integration

This research presents a mathematical formulation for optimizing integration of complex industrial systems from the level of unit operations to processes, entire plants, and finally to considering industrial symbiosis opportunities between plants. The framework is constructed using mixed-integer linear programming (MILP) which exhibits rapid conversion and a global optimum with well-defined solution methods. The framework builds upon previous efforts in process integration and considers materials and energy with thermodynamic constraints imposed by formulating the heat cascade within the MILP. The model and method which form the fundamentals of process integration problems are presented, considering exchange restrictions and problem formulation across multiple time-scales to provide flexibility in solving complex design, planning, and operational problems. The work provides the fundamental problem formulation, which has not been previously presented in a comprehensive way, to provide the basis for future work, where many process integration elements can be appended to the formulation. A case study is included to demonstrate the capabilities and results for a simple, fictional, example though the framework and method are broadly applicable across scale, time, and plant complexity.


INTRODUCTION
Increasing demand for energy and raw materials necessitate commensurate improvements in material and energy efficiency in major process industries to slow the depletion of natural resources. Process integration techniques, often proposed out of necessity from industries to respond to scarcity or market conditions, have been developed to analyse process efficiency and propose solutions. Such methods are becoming increasingly important and complex due to additional objectives such as emission reductions and job creation/retention as well as additional economic, political, or environmental constraints.
Mixed-integer linear programming (MILP) is often used for system analysis and optimization as it presents a flexible and powerful method for solving large, complex problems such as the case with industrial symbiosis and process integration. Several limitations have also lead researchers toward bi-level optimization (Aviso et al., 2010), non-linear deterministic optimization (Kantor et al., 2015), or stochastic methods (Gonela et al., 2015) to address certain aspects of processing. Such extensions often require other simplifications and the development of heuristic methods for limiting the solution space and determining solution globality.
Process integration methods can be extended from the unit process or plant level to include exchange possibilities between plants to discover new opportunities for process efficiency improvements which address modern requirements for meeting environmental and social constraints while maintaining economic competitiveness. Such opportunities have traditionally been considered on an individual basis without decisionsupport from optimization tools or advanced mathematical programming methods. The opportunity to optimize potentially interconnected, complex processing systems by applying modern optimization methods is a burgeoning research area, though a holistic framework to identify optimal industrial symbiosis solutions with a mixed-integer programming approach is still lacking. Identifying solutions for a given situation is often completed manually by professionals in the area of industrial symbiosis but solutions are likely to be sub-optimal, not taking advantage of many opportunities which may exist for integrating material and energy streams between plants. The opportunities for symbiosis solutions have been identified as promising by several authors (Jacobsen, 2006;Hashimoto et al., 2010;Kantor et al., 2015), though deep analysis of the promising solutions is required to ensure that site specificities are considered and that detailed design aspects such as materials and safety are also considered.
Connections between plants, typical symbiosis solutions, can also be augmented by a superstructure of new units which provide services within or between industries, notable examples being polymerization processes to provide low-temperature heat, organic Rankine cycles to convert waste heat into electricity, or separation processes to split mixed streams into useful fractions for one or more industrial applications. Using a comprehensive MILP optimization framework identifies the best connections between plants, utilities and modifications to achieve maximum resource efficiency, minimum environmental impact, minimum total cost, or other objectives.
This work addresses the need for comprehensive optimization frameworks. Section 2 presents a review of the current state and previous work to provide comprehensive treatment of largescale integration problems and provides motivation for this work. Section 3 presents the MILP formulation of the problem to provide such a framework and section 4 presents an example case study and its solution. The future improvements of the formulation are briefly discussed in section 5 and conclusions of this work are drawn in section 6.

BACKGROUND
Process Integration (PI) is a field of research which progresses beyond heat integration to identify processes and utilities which can be optimally integrated to achieve higher efficiencies at lower cost than conventional processes by also including material and energy flows. In classical heat integration, following the concepts of Pinch Analysis (PA), the system is divided into two subsystems, namely heat source and heat sink (Klemeš and Kravanja, 2013). The reduction in heating utility requirement results in an equal cooling utility requirement reduction. Subsequently, CO 2 emissions are also reduced, which is a primary target of modern Heat Integration (HI) and PI (Dhole and Linnhoff, 1993).
The earliest work on HI is based on PA. Edward Hohmann (1971) introduced the concept of recovery pinch, which is accepted as the first systematic approach to obtain heating and cooling targets by using the feasibility table. Linnhoff (1972) and Linnhoff and Hindmarsh (1983) proposed PA using the problem table method, which has been widely used by other researchers (Maréchal and Kalitventzeff, 1996;Klemeš and Kravanja, 2013). In PA, results are typically represented using temperatureenthalpy profiles called Composite Curve (CC) and Grand Composite Curve (GCC). An example of such curves is given in Figure 1. Constructing the CCs involves combining the heat loads of hot streams (those that need cooling) by temperature interval to form the hot composite curve and the same for cold streams (those that need heating) to obtain the cold composite curve. When plotted together, the area between the two CCs represents the potential for heat recovery, where the heating requirement of the cold streams can be satisfied by the hot streams of the system. The distance between the composites at the top of the curves gives the minimum heating requirement and the one at the bottom gives the minimum cooling requirement. This information can also be combined to form the GCC, in which all streams are combined within temperature intervals to give a profile of process thermal energy requirements. From the GCC, one can read the temperatures at which heating and cooling are required and proper utilities can be proposed to satisfy those requirements.
PA can also be adapted to other flows, provided that a stream is defined with quality (temperature, concentration, material properties, etc.) and quantity (heat load, flowrate etc.) and thus has applications beyond targeting of heat recovery. The connection between mass and heat transfer led to the extension of the method to mass pinch analysis. Mass pinch analysis was first applied for reuse of solvents by El-Halwagi and Manousiouthakis (1989) and later for other material flows such as cooling water (Kim and Smith, 2001), industrial water (Wang and Smith, 1994;Olesen and Polley, 1997), and hydrogen (Towler et al., 1996;Alves and Towler, 2002;Government of Canada, 2002). Related work in the domain by Kermani et al. and Wallerand et al. also address specific problems such as optimal integration of organic Rankine cycles (Kermani et al., 2018) and heat pumps (Wallerand et al., 2018a) with industrial processes, simultaneous optimization of water and energy (Kermani et al., 2019a), and holistic industrial system design (Kermani et al., 2019b). Industrial retrofit and planning strategies using similar methods have been proposed by Bütün et al. (2018Bütün et al. ( , 2019, and approaches to heat exchanger network design with utility systems (Ciric and Floudas, 1991) including new solution strategies (Mian et al., 2016) are also of interest within this domain. Concepts of solar utilities and various storage types, together with utilities like heat FIGURE 1 | Example of a composite curve (Left) and a grand composite curve (Right), using data from Papoulias and Grossmann (1983b). pumps, complicate the targeting formulation as described by Maréchal and Kalitventzeff (2003), but yield potential benefits in both the industrial and urban settings, as investigated by several researchers (Becker, 2012;Suciu et al., 2018;Wallerand et al., 2018b).
A similar basis of PA and mathematical programming was suggested by Kastner et al. (2015) in a review of quantitative tools for exchanges between actors in industrial parks. They provided a review of tools and existing industrial park applications of exchanges, classified by type. Methods for optimizing the exchanges were also reviewed by the authors, but without recommending specific methods or frameworks for completing such studies. The authors of the review suggest that methods need to be more comprehensive and to include aspects which have been developed by different authors and research group in the literature; thereby, many aspects can be treated by a multilayer and integrated approach, such as that proposed in this work. Various aspects must still be added, but initial steps toward integrating disparate aspects are taken herein.
While providing information on the minimum heating and cooling requirements of a given system, PA also gives indications on the utilities which should be integrated. However, for complex systems in which multiple utilities can satisfy the demand, determining the optimal configuration of utilities is difficult based solely on classical PA. Mathematical Programming (MP) techniques have been adapted in the field to fill this gap (Papoulias and Grossmann, 1983a,b). MP transforms the problem into a purely mathematical form to obtain the optimal solution by closing the energy balance, considering both thermal and electrical energy. Since energy flows are directly linked to mass flows entering and leaving the system, mass balance constraints are also added to the formulation.
Maréchal and Kalitventzeff proposed a methodology taking advantage of both PA (targeting) and MP (synthesis) (Maréchal and Boursier, 1989). The method is extended by calculating the optimal utility configuration for a given system in three steps, named AGE (Analyse-Generate-Evaluate) (Maréchal and Kalitventzeff, 1996). In the Analyse step, PA was used to determine the maximum heat recovery and the minimum heating and cooling requirements. The Generate step used Mixed Integer Linear Programming (MILP) optimization to select and size utilities from a superstructure of potential technologies. The Evaluate step used a new composite curve definition, called Integrated Composite Curve (ICC), to determine the integration of a sub-system with the rest of the system. This work was further improved by including a steam network in the mathematical formulation to optimize co-generation of heat and electricity Kalitventzeff, 1997, 1999). In Maréchal and Kalitventzeff (2003), the MILP is extended to a multitime formulation to account for different operating modes at different time steps. Kermani et al. (2019b) proposed a holistic design approach based on these concepts, also including many additional aspects in a holistic design approach but without reiterating the basis problem formulation.
When the thermodynamic models or compositions of material streams are considered, the PI problem becomes nonlinear. Both deterministic and stochastic optimization methods have been used in the literature to solve the resulting Mixed Integer Non-Linear Programming (MINLP) problem. Kantor et al. (2015) formulated the problem with a single objective, combining environmental, and economic objectives by using weight factors and simplifying several operations. Li et al. (2018) proposed a block-superstructure framework using MINLP to solve a system of balanced blocks for mass, energy, and other properties. Li et al. proposed that such an approach could be used for a variety of applications but also noted that large problems become intractable without simplifying assumptions. Gassner and Maréchal (2009) proposed a solution strategy including two steps: master optimization and slave optimization. In the slave optimization, the MILP problem was solved using the formulation of Maréchal and Kalitventzeff (2003) for each iteration of the master optimization. The master optimization used the Genetic Algorithm (GA) developed by Leyland (2002), with multiple objectives: Synthetic Natural Gas (SNG) production (economic) and energy efficiency (environmental).
The fundamental problem formulation in this work, which evolved from the formulation of Maréchal (Maréchal and Boursier, 1989;Maréchal and Kalitventzeff, 1996, 1999, 2003 uses PI in the form of an MILP. The framework has been used to analyse and optimize energy systems at large scale (Moret et al., 2014;Girones et al., 2015) and smaller scale (Girardin, 2012;Henchoz, 2016). Moreover, it has proven to be an effective tool in urban system design (Rager, 2015), biomass conversion systems (Peduzzi, 2015), and several industrial sectors such as chemicals (Pouransari, 2015;Bungener, 2016), metals (Hubert et al., 2009), food (Muller, 2007;, and pulp and paper (Kermani et al., 2019a,b). (Liu et al., 2011) also provided insight into a generic problem formulation, using the term polygeneration to denote the multi-product nature of such arrangements. They propose a generic mathematical formulation to cover various systems such as commercial buildings, biomass and biological systems, and infrastructure planning. Liu et al. (2011) also mention future perspectives for the domain from a high level and some representative results; however, a detailed mathematical formulation or practical framework is not provided for engineers to follow such methods in practice. The current work complements the work of Liu et al. (2011) by also providing such a practical formulation, allowing engineers and practitioners to complete studies for real cases. Ng et al. (2015) Proposed optimization applications for industrial symbiosis with biorefinery/bioenergy applications, allocating various flows from decomposition of biomass feedstock to units yielding the best final arrangement for maximizing value from the biomass. The optimization approach differed from other studies in the domain by using a two-sided fuzzy optimization approach for the specific application explored in the work. Despite several reviews, such as Boix et al. (2015) noting the necessity of multiple objectives, Ng et al. (2015) omitted environmental and social factors in the work but noted the capacity of the model to treat uncertainty in feedstock quality and delivery scheduling constraints.
Recent scientific contributions in literature have also cautioned practitioners and researchers from relying on a single, global, optimum as there are inherent differences between solutions, which are not (or cannot) be captured by the mathematical formulation. Recent developments in this area extend to the use of parallel coordinate plots to display multiple solutions with their key settings and performance indicators. Effective use of such tools in the domain of process integration are notably demonstrated by Kermani et al. (2019b), who generated multiple industrial system designs using a holistic and systematic approach. This elucidates the need for advanced graphing and/or decision-making support tools to also be linked to computational frameworks, adding another element required to attain practicable methods for process integration solutions.
There have been several attempts to describe a comprehensive process integration framework. Bolliger (2010) described an integrated toolbox and explained the methodology behind it. Moret et al. (2016) summarized an MILP formulation in a compact form, covering the main sets, parameters, and variables. Computational burdens for large problems have led to increasingly more efficient implementations of the framework in various languages, such as Lua (Miller et al., 2009). Yoo et al. (2015) introduced such a Lua-based concept and gave a global overview; however, the available literature has not fully represented the problem and optimization structure in a comprehensive way. The previous efforts in the field, mentioned here, are often lacking a full problem formulation to focus on the novel aspect of the specific work; therefore, omitting the basis problem formulation. Kuznetsova et al. (2016) proposed a framework for application in eco-industrial park design and optimization, including many commonalities with the proposed work; however, no detailed mathematical framework is actually presented in the work, therefore inhibiting others from adopting or applying the approach. The authors proposed a sourcesink mathematical model of interconnected black-box units and specific applications, but such a framework naturally omits many exchanges and improvements to be made within plants. Tools for optimizing process design for cleaner production were reviewed by Fan et al. (2020), delving into the basis methods such as PA and P-graph approaches in addition to novel techniques such as artificial intelligence and modern first-principles modeling tools. They introduced perspectives related to modern problems and noted findings of various active researchers in the field of process optimization, specifically with respect to the environmental impact of various scenarios and technologies. They proposed a variety of applications and tools under exploration by various authors as having large potentials in the field, including advances modeling and optimization approaches such as that proposed in the current work. An optimization-based framework for industrial symbiosis was developed by Valenzuela-Venegas et al.
(2020) with application to well-known, existing, eco-industrial networks. They showed that, depending on the objective function, the implemented exchanges in existing arrangements are not necessarily optimal, highlighting the importance of methodological developments to find and implement the best solutions and not to approach the problem ad hoc. The mathematical formulation of the problem is similar to that proposed by Kantor et al. (2015), using mixed-integer non-linear programming to describe complex relationships where linear functions often introduce error. Additional effort was made to correctly model the flow of exchanged materials including detailed flow and piping calculations, though necessitating the use of non-linear constraints for correctly modeling the detailed correlations.
Development in optimization of industrial exchange networks, termed eco-industrial parks, eco-industrial networks, or simply eco-parks is also a burgeoning research field but has not been succinctly and sufficientl y addressed by recent work. Boix et al. (2015) reviewed recent work in the field, but used excessively specific search terms to identify the recent research in the field and therefore excluded many relevant publications. Regardless, this review drew attention to a variety of relevant topics in the domain and classified them by their main focus, being water, material (more generally), and energy. The authors concluded that multi-objective optimization methods are clearly required to address the complexity and potential benefits of performing such studies on potential exchange networks. The need for treating uncertainty, flexibility, operability and retrofit of energy systems was also discussed by Andiappan (2017), who reviewed the state of the art in optimization of energy systems, also relating the domain of process systems engineering to the study of industrial parks, and classifying studies based on the approach taken. Andiappan (2017) also noted various approaches for treating multi-objective optimization problems, uncertainties and design/operation decoupling, but also did not elaborate a mathematical formulation to structure and solve the model. Boix et al. (2015) also pointed to the interconnected nature of many aspects in such industrial networks and that properly formulating the mathematical problem to simultaneously optimize water, material, and energy is a key for future development. Additional gaps such as uncertainty and data collection are the subject of other publications in the field, such as Moret et al. (2017).
Such gaps as methods for simultaneous optimization of resources and energy were identified in the published literature, exhibiting a lack of thorough optimization frameworks and their formulations to consider complex industrial symbiosis problems and find novel solutions for process integration within and between industrial sites. This work, therefore, combines and builds on previous work in the field to present a comprehensive and holistic process integration framework which forms a powerful basis for solving system integration and industrial symbiosis problems in the MILP domain. This work encompasses the method and optimization framework, independent of language choice and implementation; therefore, presenting an approach and not software, as such. The extensible framework proposed herein can be facilitated by implementation in any programming language but requires coupling with an optimization tool for solving the optimization problem.

METHODOLOGY
The framework developed for this approach uses a mixed-integer linear programming formulation, balancing imprecision from linearization with benefits of global optima and rapid resolution time when compared to mixed-integer non-linear programming formulations. MILP formulations have been suggested and effectively applied by other researchers in the domain, as the problem can be effectively formulated using such an approach. Other problems in the domain of process integration, such as heat exchanger network design, determination of the minimum approach temperature for heat exchange, necessitate non-linear approaches to treat their specificities. Inclusion of complex units/operations are treated by connecting with external flowsheeting software or by the use of surrogate models, which is discussed briefly in section 3.6. Rapid resolution time is important when considering multiple solutions, as elucidated in the previous section to be increasingly relevant in this field, as long solution times inhibit this systematic generation of solutions and the associated feedback from decision-makers. Li et al. reported solution times between 1 and 24 h for several academic case studies (Li et al., 2018) using an MINLP solver, whereas larger problem sizes can yield solution times from 7 to 10 days (Kantor et al., 2015). Generating multiple solutions for large problems can therefore become computationally prohibitive using an MINLP framework, highlighting the importance of rapid solution generation using MILP. The problem formulation in this chapter is principally inspired by Maréchal and Kalitventzeff (2003), with learning from other research in the domain, summarized in section 2, including the extension of the targeting approach to include material flows, which are especially relevant for industrial symbiosis applications. Kermani et al. (2019b) proposed a holistic design approach based on these principles, exploring additional options for utility integration and a solution strategy with a genetic algorithm guiding the MILP formulation to explore a large and complex solution space with many objectives. Such a MILP sub-problem is explicitly formulated here to provide the basis for such additional work.

Definition of Sets
The MILP problem presented in this article is essentially a simultaneous sizing and scheduling optimization problem, where the objective is to minimize costs and/or environmental impact, and where the system to be optimized is represented by a number of units belonging to the set U. Units can be seen as nodes that supply, demand or convert material and energy streams (∈ S) associated with different layers (∈ L). In each layer, a constraint is imposed to close the mass or energy balance.
The set of units is further divided into process units (subset PU ⊂ U) and utility units (subset UU ⊂ U). A process unit has fixed size and operation, which means that it cannot be optimized (i.e., the associated decision variables are fixed). They typically represent a process demand (e.g., chemical plant) or service (e.g., space heating in a building) that has to be fulfilled. In contrast, utility units are not fixed and can therefore be optimized in terms of existence, size, and operation.
The different sets are represented in Figure 2 using an illustrative example of an energy system. In this example, five layers, corresponding to different resources and forms of energy, are defined (represented by different colors on the figure): Heat, Electricity, Gas, Wood, and SNG 1 . Each unit (represented by boxes) has at least one associated stream (represented by arrows) and each must belong to a layer (e.g., the stream EG_elec is associated to ElecGrid and is of layer Electricity).
Two process units, ProcessA and ProcessB, impose a demand of heat and electricity to be fulfilled by a set of utility units representing energy conversion technologies: Gasif, CHP 2 , SCHP 3 , HP 4 . The latter require inputs of resources that are provided by additional utility units (WoodRes, GasGrid, ElecGrid, and WasteHeat). Finally, excess heat that has not been used can be disposed of in the AeroCond utility unit to close the system energy balance. As mentioned, each stream s ∈ S is associated to a unit u ∈ U and layer l ∈ L, such that a subset SoU u ⊂ S containing all streams associated to unit u and a subset SoL l ⊂ S containing all streams associated to layer l can be defined. From these subsets, it is possible to deduce the subsets UoS s ⊂ U (Equation 1) and UoL l ⊂ U (Equation 2) representing the unit to which a stream s is associated, and the units that have streams of layer l associated to them, respectively.
The system is divided into sub-systems called clusters (set C) which imply the closure of certain types of balances. Hence, each unit is located inside a cluster, and the subset UoC c ⊂ U contains all units located inside cluster c ∈ C. This concept is illustrated graphically in Figure 3. Different layer types, on which different types of constraints may apply, are defined in the set LT. The different layer types defined here are mass balance (mb), resource balance (rb), and heat cascade (hc). Each layer has an associated layer type, and the subset LoT lt ⊂ L contains all layers of type lt ∈ LT. The subsets MBU l,c ⊂ U and RBU l ⊂ U, containing all units with at least one stream of type mb and rb associated to them, respectively, are obtained using (3) and (4).

Objective
Considering multiple objectives such as economics, environmental impact and social obligations, the process integration MILP problem could have many objectives, two of which are often cost and environmental impact. The goals can be achieved by minimizing one of the objectives shown here as Equation (5) for operating cost, Equation (6) for investment cost, or Equation (7) for environmental impact. The objectives mentioned here are typical selections for many problems in process integration but other objectives could be considered with an appropriate mathematical formulation. Objective selection must be completed as required for the specific user and application.
where c op,fix u,p,t and c op,var u,p,t are the fixed and variable operating costs of unit u ∈ U in period p ∈ P and time t ∈ ToP p , respectively, and t op t is the operating time of t.
where c inv,fix u and c inv,var u are the fixed and variable investment cost of unit u ∈ U in period p ∈ P, respectively. W = u∈U p∈P t∈T e fix u · y u,p + e var u · f u,p · t op t ∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP p (7) where e fix u and e var u are the fixed and variable impact of unit u ∈ U in period p ∈ P and time t ∈ ToP p , respectively. The impact can be defined as any quantifiable environmental impact, such as CO 2 emissions.
If environmental costs can be monetized, Equation (7) can be included in the operating and investment cost functions by adding the cost of the emission, c e , as shown in Equations (8) and (9) for operating cost and investment cost with emissions, respectively.
∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP p , ∀e ∈ E The total cost can also be calculated which finds the minimum cost solution by combining the benefits from reducing the operating cost and associated capital expenditure. This total cost is calculated as shown by Equation (10) or, including the monetized environmental impact, Equation (11).
C total = C op + a · C inv (10) Where a is the annualization factor calculated by Equation (12) which is based on the interest rate d imposed by the industry or project investor and the project lifetime z, shown in Equation (12).

Sizing and Scheduling
The sizing of each unit is governed by Equation (13), where f u,p is a continuous variable representing the capacity factor, y u,p is a binary variable representing the existence of the unit, and f min u and f max u are parameters representing the minimum and maximum capacity of unit u. Sizing is defined for each period p, representing high level time scales (e.g., years) and used, for instance, to account for investment stages. Equation (13) ensures that, if a unit exists in period p, its capacity is within given boundaries which are set for each unit.
f min u · y u,p ≤ f u,p ≤ f max u · y u,p ∀u ∈ U, ∀p ∈ P The scheduling of each unit is governed by Equations (14-16), where f ′ u,p,t is a continuous variable representing the usage factor, y ′ u,p,t is a binary variable representing the activation, and l min u,p and l max u,p are parameters representing the minimum and maximum load factor of unit u. Scheduling is defined for each time t, representing lower-level time scales when compared to periods (e.g., hours) and used for operational scheduling 5 . Equation (14) ensures that, if a unit is activated in time t, its usage factor is within given boundaries defined as a fraction of the capacity f u,p of the unit. Moreover, if a unit is not activated in a given time t, its usage factor must also equal 0. This is imposed by the combination of: (a) multiplying the lower bound by the binary variable y ′ u,p,t and, (b) imposing the constraint in Equation (15). Finally, Equation (16) ensures that a unit can only be activated if it exists in period p.
u,p,t ∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP p (15) y ′ u,p,t ≤ y u,p ∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP p (16) The multiplication of y ′ u,p,t and f u,p in Equation (14) introduces a non-linearity which must be linearized to maintain the formulation within the MILP domain; therefore, the lower bound in this equation is linearized by a standard technique which requires creating a bounded auxiliary variable. For this purpose, a new decision variable v 1 u,p,t is created and Equation (14) is replaced by Equations (17-20), which are an equivalent set of linear constraints.
v 1 u,p,t ≤ l min u,p · f u,p ∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP p Some circumstances require the formulation to be flexible in allowing forced activation of specific units. Parameter y force u,p,t and Equation (21) have been added to force the activation of a unit in a given period and time.
y ′ u,p,t ≥ y force u,p,t ∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP p Given that they have a fixed size and operation, process units (u ∈ PU), have some parameters which are fixed. Hence, the following sizing parameters used in Equations (13-21) must all be set to 1: f min u , f max u , l min u,p , l max u,p , y force u,p,t .

Mass and Energy Balance
For each unit u, supplyṀ + l,u,p,t and demandṀ − l,u,p,t of a specific layer l ∈ L are calculated with Equations (22) and (23), respectively, whereṁ out l,u,p,t /ṁ in l,u,p,t represent the reference output/input flow 6 . Note that these are only defined for layers of type mb and rb. It should also be mentioned that the reference flows are originally properties of the streams. Indeed, each mb and rb stream is defined with a layer, a direction (in/out) and a positive reference flow. For these layer types, there can only be 5 set ToP p . 6ṁ out l,u,p,t ≥ 0 andṁ in l,u,p,t ≥ 0. one stream of a given layer defined per unit and the reference flow value of the unit for that layer corresponds to the value of the associated stream (and based on its direction).
∀l ∈ LoT lt : lt ∈ {mb, rb}, ∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP ṗ M − l,u,p,t =ṁ in l,u,p,t · f ′ u,p,t ∀l ∈ LoT lt : lt ∈ {mb, rb}, ∀u ∈ U, ∀p ∈ P, ∀t ∈ ToP p For each layer of type mb, the balance is closed inside each cluster c, meaning that streams of this layer type cannot exchange across clusters. This is enforced by Equation (24), which ensures the equalization of supply and demand for layer l, and thus the supply of all units equals the demand of all units located inside the cluster. Conversely, for layers of type rb, the balance is closed for the whole system (Equation 25), in which case the streams can exchange across clusters. The distinction between whether a particular flow is defined in a mb or rb layer is not strictly defined and depends on the specific situation of whether a flow could be shared between clusters or not. The best usage of rb layers is for resources which can be easily transported or defined as a network such as electricity, natural gas, steam, industrial gases such as hydrogen or chloride, among others. For considering industrial symbiosis, the most flexibility is provided by defining all streams which could be exchanged as belonging to rb layers which allows the greatest number of possible connections between disparate industries, sectors or sites. Streams which are difficult or impossible to exchange should be defined in mb layers to prevent the exchange between clusters.
∀l ∈ LoT mb , ∀c ∈ C, ∀p ∈ P, ∀t ∈ ToP p u∈RBU lṀ + l,u,p,t = u∈RBU lṀ − l,u,p,t ∀l ∈ LoT rb , ∀p ∈ P, ∀t ∈ ToP p Two additional variables are defined to represent the quantity of a given layer transferred from one unit to another. x mb l,c,i,j,p,t is the quantity of layer l ∈ LoT mb that is transferred from unit i ∈ MBU l,c to unit j ∈ MBU l,c (both inside cluster c ∈ C), in period p and time t. x rb follows the same principles but applies to rb layers and their corresponding units and therefore considers transfer between units belonging to different clusters. x mb and x rb are determined by Equations (26) and (27), respectively. To ensure that units only transfer what they supply (i.e., to avoid transfer nodes), Equations (28) and (29) are added.
M + l,u,p,t + i∈MBU l,c i =u x mb l,c,i,u,p,t =Ṁ − l,u,p,t + j∈MBU l,c j =u x mb l,c,u,j,p,t (26) ∀l ∈ LoT mb , ∀c ∈ C, ∀u ∈ MBU l,c , ∀p ∈ P, ∀t ∈ ToP p M + l,u,p,t + i∈RBU l i =u x rb l,i,u,p,t =Ṁ − l,u,p,t + j∈RBU l j =u x rb l,u,j,p,t ∀l ∈ LoT rb , ∀u ∈ RBU l , ∀p ∈ P, ∀t ∈ ToP p i∈MBU l,c i =u x mb l,c,u,i,p,t ≤Ṁ + l,u,p,t ∀l ∈ LoT mb , ∀c ∈ C, ∀u ∈ MBU l,c , ∀p ∈ P, ∀t ∈ ToP p (28) i∈RBU l i =u x rb l,u,i,p,t ≤Ṁ + l,u,p,t ∀l ∈ LoT rb , ∀u ∈ RBU l , ∀p ∈ P, ∀t ∈ ToP p An illustrative example of a layer balance is given in Figure 4. It shows the mass balance for the Gas layer of type rb, in a given period and time. In this example, the unit GasGrid supplies 21 kg/s of Gas to the CHPa and CHPb units that have respective demands of 5 and 16 kg/s, hence closing the balance.ṁ out Gas,GasGrid , m out Gas,CHPa andṁ out Gas,CHPb are fixed parameters of the MILP problem, while the variables are decisions of the optimization. It should be noted that as this is an rb layer; therefore, the balance is closed over all clusters and not within clusters.

Heat Cascade
The heat cascade constraints are applied to layers of type hc. They enforce the first and the second law of thermodynamics, ensuring that the energy balance in each cluster is closed (first law) and heat is only transferred from higher temperature intervals to lower ones (second law). Implicitly, this formulation includes concepts of heat integration by allowing the cascade of heat from higher temperatures to lower temperatures within a cluster.
The streams of a heat cascade layer are divided into two subsets, HS l,c and CS l,c , for hot and cold streams, respectively and are defined according to Equations (31) and (32). Each of these streams is characterized by an inlet (T in s ) and outlet temperature 7 (T out s ), and an inlet (ḣ in s ) and outlet enthalpy (ḣ out s ). A hot stream is cooled down and therefore delivers energy to the system, while a cold stream is heated up and takes energy from the system. For each heat cascade layer and each cluster, the list of all stream inlet and outlet temperatures is extracted and ordered to generate the set of temperature intervals 8 K. Hence, T k represents the lower temperature of interval k ∈ K. Equation (32) ensures that the energy balance is satisfied in each temperature interval 9 . This is achieved by residual heatṘ l,c,k,p,t which transfers excess heat from higher temperature intervals (k) to lower temperature intervals (k − 1). As the summation is applied to the streams, a usage factor for the stream is used, which is equal to that of the unit to which the stream belongs.
−Ṙ l,c,k,p,t = 0 ∀l ∈ LoT hc , ∀c ∈ C, ∀k ∈ K, ∀p ∈ P, ∀t ∈ ToP p whereṀ C Ps =ḣ The residual heat can be a minimum of zero when no heat is transferred from the corresponding temperature interval to the lower ones. This corresponds to the pinch temperature as defined for the eponymous method. Equation (33) sets the lower bound ofṘ l,c,k,p,t .
R l,c,k,p,t 0 ∀l ∈ LoT hc , ∀c ∈ C, ∀k ∈ K, ∀p ∈ P, ∀t ∈ ToP p (33) Residual heat in the first intervalṘ l,c,1,p,t is also zero since lower temperature intervals do not exist to accept the residual heat. Similarly, since heat cannot be delivered to the n k th interval from a higher temperature interval,Ṙ l,c,n k +1,p,t is zero. Equation (34) imposes these constraints. R l,c,1,p,t = 0,Ṙ l,c,n k +1,p,t = 0 ∀l ∈ LoT hc , ∀c ∈ C, ∀p ∈ P, ∀t ∈ ToP p (34) To illustrate this point, Figure 5 represents an example of a heat cascade inside a cluster PlantA containing a process unit and four utility units, for a given period and time. In this FIGURE 4 | Illustrative example of a rb layer balance.
example, each unit has at least one hot or cold stream 10 associated with it, all of which are characterized by an inlet and outlet temperature and enthalpy parameter. Thus, it is possible to draw the hot (red) and cold (blue) composite curves that show the enthalpy-temperature characteristics of the aggregated hot and cold streams, respectively, as shown in Figure 5. Moreover, the temperature intervals, which include all temperatures of all streams, are represented on this figure. Focusing on the upper portion of the curve shows the heat cascade for a specific interval k. It can be observed that for this interval, the constraint from Equation (32) imposes that the sum of the enthalpy difference of all hot streams situated above temperature T k is equal to the sum of the enthalpy difference of all cold streams situated above temperature T k plus the heaṫ R k that is cascaded to the next interval k − 1. One can therefore understand that the pinch temperature, also shown on the graph, is where the lower bound of the constraint shown in Equation (33) is reached, and where no heat is cascaded to a lower interval.

Integration With External Software
Implementation of the formulation presented in this paper could be completed in a variety of ways. Thus, the purpose of this effort is not to provide software, but rather a method which can be implemented flexibly in any programming language. For optimization, either an MILP solver must be constructed for this purpose or an input file prepared and executed within an existing optimization platform, which is the recommended approach for researchers or practitioners who are not experienced in developing optimization algorithms. A minimalist scripting language is recommended for structuring the problem and interacting between different components such as databases, flowsheeting software, plotting, thermodynamics, reporting, and the mentioned optimization languages. The optimization problem can then be written using any language suited to the purpose and linked to the appropriate solvers. The authors have used Lua (Miller et al., 2009) as a scripting language, with the optimization problem solved using AMPL (Fourer et al., 1990) or GLPK (Free Software Foundation Inc., 2019) with existing MILP solvers provided therein.
The MILP domain allows for a certain degree of modeling flexibility but inherently is limited by the requirement of linearity. Non-linear and highly complex processing systems are often simulated using flowsheeting software or purpose-built industryspecific tools. When non-linearities are integral within a process and functions cannot be linearized by well-defined techniques, these processes can be included by connecting flowsheeting or non-linear equation solvers as black-box calculation tools. Input data are prepared and output data are received from the external software without interacting directly with the reaction scheme, systems of differential equations or other non-linearities which traditionally have required simplification in MILP models or construction of surrogate models. Most software packages allow interaction from a terminal execution command, database trigger or other method which can then be linked with the optimization. Flowsheeting software which have been tested for such communications are Belsim Vali (Belsim, 2018), Aspen Plus (Al-Malah, 2016), and gPROMS [Process Systems Enterprise , PSE]; thus, any flowsheet model constructed in these software can be solved externally to provide the reference case for the MILP. This strategy obviates the need for surrogate model construction and allows a large degree of flexibility, though poses convergence risks for complex systems. The choice of whether to employ the external simulation or use of surrogate models should be evaluated for each case depending on the problem. Thermodynamic property databases such as Refprop (NIST, 2013) or Coolprop (Bell et al., 2014) can also be connected in this way to calculate and return the appropriate properties of pure components or mixtures for a range of conditions. In each example mentioned here, external software outputs must simply be connected to appropriate variables defined in the optimization framework for seamless integration.

CASE STUDY
To illustrate how the model can be applied to industrial symbiosis studies, it has been tested on a fictional case study represented in Figure 6. As shown on this figure, the case study is composed of five process units, each representing an industrial process from a different sector. The processes are simple black-box models with basic inputs of raw materials and energy, and outputs of products, co-products and waste. It should be Highlighted that the models used for the case study are purely fictional, and included for the sole purpose of illustrating the potential of the model. Despite the simplistic models used for the illustrative case, the approach remains the same as for real applications with a different level of detail/complexity. Each process unit is in a cluster of its own, which also contains a list of utility units that are common to all industrial processes. Many are relevant for locally producing or transforming heat (heat cascade layer), but also electricity production, steam production, and waste water treatment units are also included. Cluster A also contains an additional utility unit that purifies a waste stream into a usable material stream. The raw materials used as inputs by the process units are each supplied by a utility unit, all of which are regrouped in the Resources cluster. Electricity, gas, and water are provided by utility network units included in the Networks cluster. Products of the different processes are sold to utility units in the Market cluster. Finally, unsaleable co-products and waste are sent to utility units in the Waste cluster. Except the heat cascade layer to which the hot and cold streams belong and the Biomass layer which is a mass balance layer, the layers considered in this example are resource balance layers, meaning that streams of this type can exchange between clusters.
For illustrating discovery/optimization of industrial symbiosis, some products, co-products or waste from process units can be used by others, either directly (e.g., Process B outputs Material 7, which can be used by Process C) or indirectly with an intermediate utility unit (e.g., Waste 1 produced by Process C can be upgraded to Material 3 using the Purification unit, which can then be used by Process A). Moreover, excess heat (i.e., hot streams) which is not directly valorized in the process itself can be sent to another cluster via the Steam supply unit which converts a cold stream into a resource balance layer stream (Steam), which can be sent to another cluster where it can be converted into a hot stream (with Steam intake unit) to be used by the process unit. The steam network is considered to be a closed loop which implies that all condensate is recovered and recycled.
The list of utility units and associated parameter values are given in the Appendix (Table 4). For all utility units, the minimum load factor is set to zero (l min u,p = 0) and maximum load factor to one (l max u,p = 1); therefore, they are not included in the table. For process units, all cost and emissions factors are set to zero in this example and their other sizing parameters are set to unity (see section 3). The list of streams and their characteristics are also given in the Appendix (Tables 5, 6 for hc and rs/ms streams, respectively) and are ordered by units to which they belong.
Two single objective optimizations were completed, using the case study presented herein. The objective function for the first case was to minimize the total annualized cost (including operating, maintenance and investment costs). For the second, the objective function was to minimize the total annual direct CO 2 emissions. In both cases, the time horizon considered was 1 year ( t op t = 8,760 h), and a constant operating condition was assumed over the full period. Additionally, a total cost optimization was carried out on each of the five process unit clusters independently, yielding a reference without industrial symbiosis as a baseline for comparison.
Tables 1-3 give the resulting utility unit sizes in each cluster for each optimization. The utility units can be divided into four categories; the first category includes the utilities that are required to meet the process heat requirements (boilers and gas engine) or to evacuate excess heat (cooling tower). The second category includes the units with streams that are integrated into the heat cascade of the process. These recover heat from the process for either heat pumping or electricity generation (Rankine cycle, ORC). The third category relates to transfer of heat between    clusters (steam supply and intake). The fourth category deals with material streams, which includes treatment, upgrading and disposal of waste streams and co-products. The first category of utilities are always present to provide the heat required for process operation. For economic objectives in the optimization, the gas engine is typically selected as the priority provider of heat. Excepting the maximum capacity parameter, the sizing is limited by the maximum electricity or heating demand (whichever is reached first). The reference case, where each cluster is optimized independently and heat cannot be transferred from one cluster to another, is typical of conventional plant operation. In that case, extra heat is provided by the gas boiler. Conversely, when all clusters are optimized simultaneously, heat can be exchanged via the steam network, which results in higher utilization of the gas engine Heat pump 5,000 5,000 5,000 5,000 5,000 power and less of the gas boiler, since heating and electricity supply/demand are combined from all clusters. Optimization on emissions reduces the gas consumption to zero, and all gas boilers and gas engines are replaced with biomass boilers at their maximum size to maximize electricity production. Furthermore, biomass boilers, which had a prohibitive investment cost in the total cost optimization, become the most attractive option as costs are no longer considered in the objective function. The second category of utilities (heat pumps, Rankine cycles, and ORCs) may be selected if integration with process streams is possible, and if there is an advantage to do so for the given objective function. Such opportunities can be visualized on the composite curves, which are represented for each cluster of the combined optimization in Figure 7. On Figure 7C, one can see that a heat pump is integrated with process C, while on Figure 7D, an ORC is integrated with process E. Conventional Rankine cycles are never present in costoptimal solutions because the process hot stream temperatures are too low compared to the evaporation temperature of the Rankine cycle. Hence, it would require heat from a hot utility (first category), and additional investment for this extra capacity cannot be justified. Conversely, the environmental impact optimization maximizes the use of Rankine cycles to produce as much local electricity as possible, which results in reducing the CO 2 emissions from imported electricity. The heat source for this case is biomass-derived and is thus neutral with respect to CO 2 emissions. As such, the load factors of biomass boilers are maximized, but are insufficient provide the entire electricity demand.

Unit Cluster
The integration options (second category of utilities) are in competition with the third category: the units that allow transfer of heat from one cluster to another (steam network in this case). This leads to different integration options in the independent cluster optimization and the combined cluster optimization. In the latter, greater cost savings are achieved for the overall system if excess heat from some clusters is sent to other clusters via the steam network, rather than valorized locally. The composite curves show how the steam network (supply and intake units) is integrated with the process, and how it competes with the other options. On Figures 7B,D, it can be observed that the steam network recovers heat from the process, and is distributed to process A (see Figure 7A), reducing the required size of the gas engine. It should be noted that the utilities in the second and third categories do not have an associated investment cost and that the results could be very different had they been included.
The fourth and last category of utilities deal with the treatment of material streams. Only two units of this type are considered: waste water treatment plant (WWTP) and purification. The WWTP converts a wastewater stream into a (clean) water stream using heat and electricity while also producing some biowaste. The purification unit upgrades a waste stream (Waste1) to a input material stream (Material3) and also consumes heat end electricity. Optimizing either cost or emissions, the WWTP is never selected from the superstructure. With the parameters imposed for the WWTP, it is always more beneficial to export wastewater and have it treated externally due to capital cost of the treatment unit 11 . The purification unit is selected in cost optimization, but not for emissions optimization. In the reference case, the sizing of purification is limited by the quantity of Waste1 produced by process A, and therefore some import of Material3 is still required. When clusters are optimized together, a symbiosis opportunity appears, where Waste1 produced by process A is sent to the purification unit in cluster A. In that case, import of Material3 is no longer required. This can be observed in the Sankey diagram of Figure 8.
The flow of material streams and industrial symbiosis opportunities are represented using here using Sankey diagrams to show the flow of resources between clusters 12 , and are shown in Figures 8, 9. These show, for instance, that Material5 from process E and Material7 from process B are sent to process C. It is also observed that biomass boilers in the emissions optimization are primarily supplied by biowaste resulting from process A. In reality, all three sources of biomass are equivalent in the emissions optimization, as none have associated CO 2 emissions (directly or indirectly).
The optimization was carried out using an open source MILP solver named GLPSOL, present within the GLPK software 13 . Writing the files for the solver and the optimization itself used less than one second of computational time on a laptop computer running Windows 7 with a dual-core Intel core i7 2.80 GHz processor with 12 GB of RAM. The default solver options (e.g., integrality tolerance, optimality gap, etc.) of GLPK/GLPSOL were used, which are described in the solver documentation (Free Software Foundation Inc., 2019).
While the case study presented in this section is fictitious, the approach and framework described herein is broadly applicable in realistic scenarios to find good solutions for process system integration at different scales and settings. The approach is applicable at multiple levels of detail and temporal resolution 11 With the emissions optimization, the comparison is unfair since no CO 2 emissions are associated with the WasteWaterEx unit. 12 They are, in fact, representations of the x rb and x mb variables. 13 https://www.gnu.org/software/glpk/   while considering numerous energy and material streams. This provides a flexible framework to solve real system integration problems in many domains using an optimization approach to provide valuable solutions for complex problems.

RECENT DEVELOPMENTS AND FUTURE IMPROVEMENTS
A comprehensive and succinct baseline modeling structure for site optimization and industrial symbiosis has been presented; however, additional aspects should be added to provide more flexibility in system analysis. Further improvements and additions to the methodology have already been completed or should be foreseen as additions to the basis formulation: • Mass and thermal storage: A new unit type which will allow energy or material storage between times or periods will be formulated to allow flexibility over short or long time horizons and in batch processes. The storage unit will act as a sink with a variable but bounded capacity for energy or material from a specific layer, and act as a source in a future time or period. This requires sequentiality of times, as well as a cyclic constraint to ensure that the storage level returns to its initial state in the last time/period. • Heat exchanger network design: The current model solves the heat cascade, thus enabling the identification of the best possible integration target, but without specifying the heat exchangers required to achieve it. Several methods have been proposed to design optimal heat exchanger networks, such as that proposed by Ciric and Floudas (1991), and one of the most recent should be added which will accurately account for the capital costs and configuration of the network.
• Steam networks: A steam network model will be added which allows optimization of heat loads, pressure levels and compression/expansion stages considering production of electricity through turbine expanders. • Material pinch: Pinch analysis has been successfully applied to different material streams (Government of Canada, 2002) (hydrogen, water, compressed air, etc.), where the heat load is replaced by flowrate, and temperature by quality (such as impurities hydrogen, pressure for compressed air). Such a method could also be applied in the formulation presented in this paper for common resource streams. • Transportation options and costs: Including transportation media and their associated cost would provide more accurate solutions for real symbiosis solutions considering plants which are not geographical co-located. Various transportation media would be considered (road, rail, ship, pipeline, conveyor belt), each of which would be associated to different costs and emissions as in Kantor et al. (2015). The cost and emissions would also be included in the calculation of the objective function. In the formulation presented here, x rb and x mb would be indexed over the set of transportation options with additional constraints considering the type of materials and its transportability by the different media. • Coupling with GIS: Geographical coordinates should be introduced to assign real locations with units. The location of new units could also be optimized, as this would have an impact on the transport costs/piping networks involved. Coupling the optimization structure to a GIS database is closely linked to the consideration of transportation options.
• Investment planning: Modifications on the framework presented here will be proposed to allow the optimal selection of not only what technologies to invest in from the superstructure, but also which period to make the investment and how to cascade investments to prevent shortsighted planning or to account for availability of capital investment funds. • Additional KPIs: Adding Key Performance Indicators (KPIs) that can either be calculated or added to the objective function. This can include indicators such as human health impact or job creation. Theoretically, any impact associated with technology selection from the superstructure could be included, provided that it can be linearized for application in the MILP formulation. • Optimization under uncertainty: Though often proposed to be fixed values, parameters used in the optimization are subject to uncertainty. Taking this uncertainty into account in the optimization model would not only allow the identification of the optimal solution, but the most probable optimal solution given a variety of uncertain events. For industrial symbiosis, consideration of random unit failures can also assist in understanding the risk of inter-plant dependencies. • Parametric optimization: This is a well-known method to carry out multi-objective optimization for a limited number of objectives which cannot be characterized by the same physical units, normalized or otherwise combined into a single objective. While one of the objectives is optimized, a constraint with varying bounds is applied on each of the other objectives. This technique permits the identification of many solutions (Pareto frontier) which are beneficial to some degree in unrelated areas and provides additional options for decision-makers to select from a set of optimal solutions. • Flexibility optimization: In order to take into account the variable electricity availability of renewable sources like sun or wind, a methodology to assess the robustness of a system to real time changes and variations should be developed.

CONCLUSION
In this paper, a methodology implemented using MILP was presented, which optimizes process integration opportunities on different scales, providing solutions for integration within a plant and potential industrial symbiosis options. The presented model builds from many previous contributions in process integration and industrial process optimization. It deals with both material and energy streams of different types and incorporates pinch analysis fundamentals to obey the second law of thermodynamics. Given fixed requirements by processes, the optimization solves the sizing and scheduling problem simultaneously, resulting in the optimal set of technologies selected from a superstructure to fulfil the requirements (heating, waste treatment, etc.), their size and operation. Several objective functions were proposed for use within this model, namely: operating costs, investment costs and environmental impacts of different kinds (e.g., CO 2 emissions) but other objectives could also be formulated. MILP formulations typically optimize a single objective and thus modification of the objective functions (e.g., by monetizing environmental impacts) also provides an option for assessing differing objectives within this framework. Additional methods, such as parametric optimization, are proposed in future work to consider multiple objectives without aggregation.
To illustrate the methodology, a fictional case study consisting of five industrial processes was constructed with potential for industrial symbiosis through the exchange of energy and materials. The methodology identifies the technologies necessary to fulfil the requirements of the processes in an optimal way, with respect to different objectives. It shows which utilities should supply heat to the different processes (boilers, CHPs), which utilities can be integrated with the heating profiles of the processes (heat pumps, Rankine cycles), which exchanges are best between processes and the associated technologies required to transfer and/or treat the recovered waste or co-products.
Numerous future additions to the model have been identified to improve the methodology, to better represent the systems and handle a wide variety of cases in industrial symbiosis studies. These include the addition of new types of technologies (e.g., storage, detailed steam network, heat exchanger network), the inclusion of new methodologies (e.g. material pinch), increasing temporal (investment planning) and spatial (coupling with GIS) granularity of the model and extending the possibilities of the model with multi-objective optimization (using parametric optimization) and optimization under uncertainty.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
IK and J-LR contributed equally to the conceptualizing, writing, and editing of the article. HB also contributed to allaspects of the article. FM contributed to the conceptualization and discussion.